The current Exasol connector (2.1.6) is not compatible with the latest Exasol JDBC driver (24.1.1)
Context
We have a Spark data processing pipeline that uses version 1.4 of the Exasol connector and version 7 of the JDBC driver. Everything was working fine, but our setup was quite outdated (404 update policy not found), so we decided to update our dependencies to the above-mentioned versions.
However, after running something like this:
val df = spark
.read
.format("exasol")
.option("host", "10.210.21.191")
.option("port", "8563")
[etc]
we ran into the following error:
24/07/15 16:29:26 INFO ExasolRDD: Sub connection with url = jdbc:exa-worker:<IP>/<FINGERPRING>:20431;workerID=0;workertoken=7362057731359728037;debug=1 and handle = 1
24/07/15 16:29:26 DEBUG ExasolConnectionManager: Making a connection using url = jdbc:exa-worker:<IP>/<FINGERPRING>:20431;workerID=0;workertoken=7362057731359728037;debug=1
24/07/15 16:29:26 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0) java.sql.SQLException: [ERROR] Connection String does not support (workerID) argument.
But Why?
Looking at the Exasol Spark connector source, we can see how the connection string is created:
val url = s"$WORKER_CONNECTION_PREFIX:$hostWithFingerprint:$port;workerID=$idx;workertoken=$token"
The Exasol connector creates an Exasol client object to retrieve all necessary information about the Exasol cluster, including the IP of each node and the worker token. Pay particular attention to the worker ID: it is simply the index of the worker.
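To make the shape of these URLs concrete, here is a minimal sketch of how one sub-connection URL per worker could be assembled. It is not the connector's actual code: the object name, the `workerUrl` helper, and the hosts, fingerprint, and ports are made up for illustration. Only the URL layout comes from the line quoted above.

```scala
// Minimal sketch, NOT the connector's real implementation.
// All names and values below are hypothetical stand-ins.
object WorkerUrlSketch {
  val WorkerConnectionPrefix = "jdbc:exa-worker"

  // Builds one sub-connection URL, using the same layout as the connector line above.
  // idx is just the position of the worker in the list of sub-connections.
  def workerUrl(hostWithFingerprint: String, port: Int, idx: Int, token: Long): String =
    s"$WorkerConnectionPrefix:$hostWithFingerprint:$port;workerID=$idx;workertoken=$token"

  def main(args: Array[String]): Unit = {
    val hosts = Seq("10.0.0.11/ABCDEF", "10.0.0.12/ABCDEF") // made-up hosts and fingerprint
    val ports = Seq(20431, 20432)                           // made-up ports
    val token = 7362057731359728037L                        // token value taken from the log above

    // Pair each host with its port, then number the workers 0, 1, ...
    hosts.zip(ports).zipWithIndex.foreach { case ((host, port), idx) =>
      println(workerUrl(host, port, idx, token))
    }
  }
}
```

Every URL built this way carries a `workerID` property, so every sub-connection is affected, not just the first one.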
However, according to the Exasol JDBC driver documentation, there is no driver property called “workerid”. This explains the error stating that the connection string does not support that argument.
But why did it work in the previous setup?
Explanation
Validating the arguments of the connection string is a recent feature added by the JDBC driver (changelog). Earlier versions did not perform this validation, which is why the previous setup worked: the unknown argument was simply never checked. Consequently, as of this writing, the latest Exasol JDBC driver is no longer compatible with the Exasol Spark connector, because the connector always adds an argument that the driver now rejects.
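The difference between the two behaviours can be illustrated with a small sketch. This is purely our own model, not the driver's code: the `knownProperties` set is an illustrative subset rather than the driver's real list of supported properties, and the `lenient` and `strict` functions are hypothetical stand-ins for the old and new driver behaviour.

```scala
// Minimal sketch of lenient vs. strict connection string handling.
// This is an assumption-laden model, NOT the Exasol JDBC driver's implementation.
object PropertyValidationSketch {
  // Illustrative subset only; the real driver supports a different, longer list.
  val knownProperties: Set[String] = Set("debug", "workertoken")

  // Extracts the property names that follow the first ';' in a connection string.
  def propertyNames(url: String): Seq[String] =
    url.split(";").drop(1).toSeq.map(_.takeWhile(_ != '='))

  // Assumed old behaviour: unknown properties are never checked.
  def lenient(url: String): Either[String, String] = Right(url)

  // Assumed new behaviour: the first unknown property is rejected.
  def strict(url: String): Either[String, String] =
    propertyNames(url).find(name => !knownProperties.contains(name.toLowerCase)) match {
      case Some(name) => Left(s"Connection String does not support ($name) argument.")
      case None       => Right(url)
    }

  def main(args: Array[String]): Unit = {
    val url = "jdbc:exa-worker:host:20431;workerID=0;workertoken=1;debug=1"
    println(lenient(url)) // accepted, the extra argument goes unnoticed
    println(strict(url))  // rejected because of workerID
  }
}
```

Under this model, the same URL passes the lenient check and fails the strict one, which matches what we observed when moving from the old driver to the new one.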
The current version of the JDBC driver was released in March 2024, so this is a fairly recent change. Given how actively the open-source Exasol Spark connector is maintained, it is likely just a matter of time before this issue is fixed.