Connect to AWS Athena Database in a Mule Application with Generic Database
This blog helps understand the steps to configure the Mule application to use generic database configuration to connect to Amazon Athena using the AWS Athena JDBC Driver.
API, Mulesoft, REST API, Athena, AWS Athena, JDBC, Database, DB
Mulesoft | Anupam Chakraborty | Jul 25 2023 11:28 PM
Introduction
This would be a short tutorial on connecting from MuleSoft to AWS Athena Database. We will use Generic Database Driver.
Please read through the following resources for further information before getting started in case.
- Introduction to MuleSoft Anypoint Studio
- Introduction to building API using MuleSoft
- MuleSoft Database Connector
- MuleSoft Database Connector: Configuring a generic connection
- Connect Amazon Athena with JDBC
However, we will get started step by step and will learn through each process. We assume that we have the following information before getting started:
- Athena S3 Output Location: <S3OutputLocation>
- AWS Region: <AwsRegion>
- UserName: <User>
- Password: <Password>
Get Started
We will start by creating a Project in Any point MuleSoft Platform. If you want to understand how to do this, please check my blog on Introduction to building API using MuleSoft.
Once we have created the Application, the first step is to Update our POM with the necessary JAR. However, I was not able to find a MAVEN Repository having the necessary JAR Files available to be used. However, I was able to get the jar files.
The jar files can be available for download here in my GitHub space.
Alternatively, you can download the jar file from the AWS Documentation Connect Amazon Athena with JDBC
In my case, I am using the version 2.1.0.1000, the JDBC driver with AWS SDK.
Once the corresponding jar is downloaded, we need to upload the same to the artifactory that we use. In my case, I will simply upload this to my Maven Repository using the following command.
mvn install:install-file \
-Dfile=<path-to-file> \
-DgroupId=<group-id> \
-DartifactId=<artifact-id> \
-Dversion=<version> \
-Dpackaging=<packaging> \
-DgeneratePom=true
In my case
mvn install:install-file -Dfile=<path-to-file> -DgroupId=aws.athena -DartifactId=AthenaJDBC42 -Dversion=2.1.0.1000
This is how my .m2/aws directory will look like.
I would overwrite the pom file that I have downloaded from the gitlab.
Set up my application POM File
Once completed, we must update my POM File at 2 places:
- Add the dependency to the AthenaJDBC42 jar
<dependency>
<groupId>aws.athena</groupId>
<artifactId>AthenaJDBC42</artifactId>
<version>2.1.0.1000</version>
</dependency>
- Add the shared library configuration.
<plugin>
<groupId>org.mule.tools.maven</groupId>
<artifactId>mule-maven-plugin</artifactId>
<version>${mule.maven.plugin.version}</version>
<extensions>true</extensions>
<configuration>
<sharedLibraries>
<sharedLibrary>
<groupId>aws.athena</groupId>
<artifactId>AthenaJDBC42</artifactId>
</sharedLibrary>
</sharedLibraries>
<classifier>mule-application</classifier>
</configuration>
</plugin>
Starting the code
Finally, I would have to start my code. We will head back to the actual xml and start by creating a flow consisting of the following in order:
- HTTP Listener
- Logger
- Database Select
- Logger
- Map to convert output to JSON
The Flow should look something like this.
Here is the code that I have in my XML
<?xml version="1.0"encoding="UTF-8"?>
<mule xmlns:ee="http://www.mulesoft.org/schema/mule/ee/core"xmlns:db="http://www.mulesoft.org/schema/mule/db"
xmlns:http="http://www.mulesoft.org/schema/mule/http"
xmlns="http://www.mulesoft.org/schema/mule/core"xmlns:doc="http://www.mulesoft.org/schema/mule/documentation"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xsi:schemaLocation="http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/current/mule.xsd
http://www.mulesoft.org/schema/mule/http http://www.mulesoft.org/schema/mule/http/current/mule-http.xsd
http://www.mulesoft.org/schema/mule/db http://www.mulesoft.org/schema/mule/db/current/mule-db.xsd
http://www.mulesoft.org/schema/mule/ee/core http://www.mulesoft.org/schema/mule/ee/core/current/mule-ee.xsd">
<http:listener-config name="HTTP_Listener_config"doc:name="HTTP Listener config">
<http:listener-connection host="0.0.0.0"port="8081" />
</http:listener-config>
<db:config name="Athena_Database_Config">
<db:generic-connection
url="jdbc:awsathena://User=${athena.User};Password=${athena.Password};AwsRegion=${athena.AwsRegion};S3OutputLocation=${athena.S3OutputLocation}"
driverClassName="com.simba.athena.jdbc.Driver"/>
</db:config>
<flow name="test-athena-dbFlow"doc:id="cc20b26a-4cc0-4a81-b66a-bf97e310858a">
<http:listener doc:name="Listener"doc:id="d6463d2c-4895-43e7-a5f8-ae4e261cf187"config-ref="HTTP_Listener_config" path="/testAthena"/>
<logger level="INFO"doc:name="Logger" doc:id="21cb94b1-a3cc-436a-a97b-e362c2ab8989"message="Request Received"/>
<db:select doc:name="Select"doc:id="6210b6c3-bc8e-4e7d-87f1-dafa888fdf45"config-ref="Athena_Database_Config">
<db:sql ><![CDATA[${athena.sqlQuery}]]></db:sql>
</db:select>
<logger level="INFO"doc:name="Logger" doc:id="c17e9502-e118-4a06-a33f-ccc816fcd7b5"message="Database Connection Complete"/>
<ee:transform doc:name="Transform Message" doc:id="7de4dceb-e124-41f6-a51a-6718ae2bc6b5">
<ee:message >
<ee:set-payload ><![CDATA[%dw 2.0
output application/json
---
payload]]></ee:set-payload>
</ee:message>
</ee:transform>
</flow>
</mule>
Testing my application
Finally, I will run my application in MuleSoft.
Once application is Deployed, Up and kicking, we can call the API from Postman:
Conclusion
This blog helps understand the steps to configure the Mule application to use generic database configuration to connect to Amazon Athena using the AWS Athena JDBC Driver. Please let me know your thoughts. The Idea was to connect to the database. Of course, we should have proper error handling, HTTPs security etc. in the Application.
Comments:
Leave a Comment:
Please login to post comment into this page.