Skip to content

[ZEPPELIN-6204] do not allow init params in JDBC URLs for H2 #4949

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
8 changes: 8 additions & 0 deletions jdbc/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@
<commons.dbcp2.version>2.0.1</commons.dbcp2.version>
<hive3.version>3.1.3</hive3.version>
<kyuubi.version>1.9.1</kyuubi.version>
<mysql.connector.version>9.3.0</mysql.connector.version>

<!--test library versions-->
<mockrunner.jdbc.version>1.0.8</mockrunner.jdbc.version>
Expand All @@ -59,6 +60,13 @@
<scope>test</scope>
</dependency>

<dependency>
<groupId>com.mysql</groupId>
<artifactId>mysql-connector-j</artifactId>
<version>${mysql.connector.version}</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
Expand Down
133 changes: 125 additions & 8 deletions jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java
Original file line number Diff line number Diff line change
Expand Up @@ -156,14 +156,22 @@ public class JDBCInterpreter extends KerberosInterpreter {
"KerberosConfigPath", "KerberosKeytabPath", "KerberosCredentialCachePath",
"extraCredentials", "roles", "sessionProperties"));

private static final String ALLOW_LOAD_LOCAL = "allowLoadLocal";

private static final String ALLOW_LOAD_LOCAL_IN_FILE_NAME = "allowLoadLocalInfile";

private static final String AUTO_DESERIALIZE = "autoDeserialize";
private static final String ALLOW_LOAD_LOCAL_IN_FILE_IN_PATH = "allowLoadLocalInfileInPath";

private static final String ALLOW_LOCAL_IN_FILE_NAME = "allowLocalInfile";

private static final String ALLOW_URL_IN_LOCAL_IN_FILE_NAME = "allowUrlInLocalInfile";

private static final String AUTO_DESERIALIZE = "autoDeserialize";

private static final String SOCKET_FACTORY = "socketFactory";

private static final String INIT = "INIT";

Comment on lines +159 to +174
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be nice to group these into a Set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to stick with the existing code style.

// database --> Properties
private final HashMap<String, Properties> basePropertiesMap;
// username --> User Configuration
Expand Down Expand Up @@ -588,18 +596,127 @@ public Connection getConnection(InterpreterContext context)
return connection;
}

private void validateConnectionUrl(String url) {
String decodedUrl;
decodedUrl = URLDecoder.decode(url, StandardCharsets.UTF_8);
// package private for testing purposes
static void validateConnectionUrl(String url) {
final String decodedUrl = urlDecode(url, url, 0);
final Map<String, String> params = parseUrlParameters(decodedUrl);

if (containsKeyIgnoreCase(params, ALLOW_LOAD_LOCAL) ||
containsKeyIgnoreCase(params, ALLOW_LOAD_LOCAL_IN_FILE_NAME) ||
containsKeyIgnoreCase(params, ALLOW_LOCAL_IN_FILE_NAME) ||
containsKeyIgnoreCase(params, ALLOW_URL_IN_LOCAL_IN_FILE_NAME) ||
containsKeyIgnoreCase(params, ALLOW_LOAD_LOCAL_IN_FILE_IN_PATH) ||
containsKeyIgnoreCase(params, AUTO_DESERIALIZE) ||
containsKeyIgnoreCase(params, SOCKET_FACTORY)) {
throw new IllegalArgumentException("Connection URL contains sensitive configuration");
}
Comment on lines +604 to +612
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (containsKeyIgnoreCase(params, ALLOW_LOAD_LOCAL) ||
containsKeyIgnoreCase(params, ALLOW_LOAD_LOCAL_IN_FILE_NAME) ||
containsKeyIgnoreCase(params, ALLOW_LOCAL_IN_FILE_NAME) ||
containsKeyIgnoreCase(params, ALLOW_URL_IN_LOCAL_IN_FILE_NAME) ||
containsKeyIgnoreCase(params, ALLOW_LOAD_LOCAL_IN_FILE_IN_PATH) ||
containsKeyIgnoreCase(params, AUTO_DESERIALIZE) ||
containsKeyIgnoreCase(params, SOCKET_FACTORY)) {
throw new IllegalArgumentException("Connection URL contains sensitive configuration");
}
Stream.of(ALLOW_LOAD_LOCAL, ALLOW_LOCAL_IN_FILE_NAME, ...).forEach(rule -> {
if (containsKeyIgnoreCase(params, rule)) throw new IllegalArgumentException("Connection URL contains sensitive configuration: " + rule);
});

I find this version easier to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't commit this unless it compiles so please don't use the suggestion option to suggest incomplete code.
I also disagree that this is more readable. The existing code in this PR is closer to the existing style in the existing codebase.


if (containsIgnoreCase(decodedUrl, ALLOW_LOAD_LOCAL_IN_FILE_NAME) ||
containsIgnoreCase(decodedUrl, AUTO_DESERIALIZE) ||
containsIgnoreCase(decodedUrl, ALLOW_LOCAL_IN_FILE_NAME) ||
containsIgnoreCase(decodedUrl, ALLOW_URL_IN_LOCAL_IN_FILE_NAME)) {
// the INIT parameter name is a bit generic so we check it only for H2
if (containsIgnoreCase(decodedUrl, "jdbc:h2") && containsKeyIgnoreCase(params, INIT)) {
throw new IllegalArgumentException("Connection URL contains sensitive configuration");
}
}

/**
* Decode the URL encoded string recursively until no more decoding is needed.
* This is to handle cases where the URL might be double-encoded.
*
* @param url the original URL (for logging purposes)
* @param encoded the URL encoded string
* @param recurseCount the current recursion depth
* @return the decoded string
* @throws IllegalArgumentException if the recursion depth exceeds 10
*/
private static String urlDecode(final String url,
final String encoded,
final int recurseCount) {
if (recurseCount > 10) {
throw new IllegalArgumentException("illegal URL encoding detected: " + url);
}
final String decoded = URLDecoder.decode(encoded, StandardCharsets.UTF_8);
if (decoded.equals(encoded)) {
return decoded; // No more decoding needed or max recursion reached
}
return urlDecode(url, decoded, recurseCount + 1);
}

private static Map<String, String> parseUrlParameters(final String url) {
final Map<String, String> parameters = new HashMap<>();

// MySQL supports parentheses in the URL
// https://dev.mysql.com/doc/connectors/en/connector-j-reference-jdbc-url-format.html
// eg jdbc:mysql://(host=myhost,port=1111,allowLoadLocalInfile=true)/db
int parensIndex = extractFromParens(url, 0, parameters);
while (parensIndex > 0) {
parensIndex = extractFromParens(url, parensIndex, parameters);
}

// Split the URL into the base part and the parameters part
String[] parts = url.split("[?&;]");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be a bit faster if the regex was cached to avoid rebuilding every time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the cost of this regex will be tiny compared to the I/O cost of using JDBC commands so I'm not very interested in optimising every line of code here

if (parts.length > 1) {
// The first part is the base URL, so we start from the second part
for (int i = 1; i < parts.length; i++) {
splitNameValue(parts[i], parameters, true);
}
}
return parameters;
}

private static boolean containsKeyIgnoreCase(Map<String, String> map, String key) {
for (String k : map.keySet()) {
if (k.equalsIgnoreCase(key)) {
return true;
}
}
return false;
}

/**
* Extracts key-value pairs from parentheses in the input string.
* The expected format is "(key1=value1, key2=value2, ...)".
*
* @param input the input string containing parameters in parentheses
* @param initIndex the index to start searching for parentheses
* @param parameters the map to store extracted key-value pairs
* @return the index of the closing parenthesis or -1 if not found
*/
private static int extractFromParens(final String input,
final int initIndex,
final Map<String, String> parameters) {
final int startIndex = input.indexOf('(', initIndex);
if (startIndex == -1) {
return -1;
}
final int endIndex = input.indexOf(')', startIndex);
if (startIndex != -1 && endIndex != -1) {
String params = input.substring(startIndex + 1, endIndex);
String[] keyValuePairs = params.split(",");
for (String pair : keyValuePairs) {
splitNameValue(pair, parameters, false);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work well if parentheses are in the URL query parameters?
jdbc:mysql://.../db?password=abc(def)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code doesn't care about what the values are, it only cares about the param names - write some test cases if you are suspicious

Copy link
Contributor

@seung-00 seung-00 Jul 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the following cases:

testBannedMySQLQueryParamWithValidParam("password=allowLoadLocalInfile"); // 1. fail
testBannedMySQLQueryParamWithValidParam("password=(allowLoadLocalInfile)"); // 2. pass

I might be missing something, but test case 2 seems unintended.
Part of the value allowLoadLocalInfile appears to be treated as a key due to the parentheses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks - I'll look into the issue - but in the end of the day, it's not a very likely password
and the existing checks before this PR, they just do a string match on the whole URL so that code wouldn't handle this either

return endIndex;
}

/**
* Splits a name-value pair and adds it to the parameters map.
* Handles cases where the value might be missing.
*
* @param nameValue the name-value pair as a string
* @param parameters the map to store the extracted key-value pair
* @param allowEmptyValue whether to allow empty values
*/
private static void splitNameValue(String nameValue, Map<String, String> parameters,
boolean allowEmptyValue) {
String[] keyValue = nameValue.split("=");
if (keyValue.length >= 2) {
parameters.put(keyValue[0].trim(), keyValue[1].trim());
} else if (allowEmptyValue) {
// Handle cases where there might not be a value
parameters.put(keyValue[0].trim(), "");
}
}

private String appendProxyUserToURL(String url, String user) {
StringBuilder connectionUrl = new StringBuilder(url);

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -58,11 +58,7 @@
import static org.apache.zeppelin.jdbc.JDBCInterpreter.DEFAULT_URL;
import static org.apache.zeppelin.jdbc.JDBCInterpreter.DEFAULT_USER;
import static org.apache.zeppelin.jdbc.JDBCInterpreter.PRECODE_KEY_TEMPLATE;

import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.fail;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.*;


/**
Expand Down Expand Up @@ -748,33 +744,85 @@ void testSplitSqlQueryWithComments() throws IOException,
}

@Test
void testValidateConnectionUrl() throws IOException, InterpreterException {
Properties properties = new Properties();
properties.setProperty("default.driver", "org.h2.Driver");
properties.setProperty("default.url", getJdbcConnection() + ";allowLoadLocalInfile=true");
properties.setProperty("default.user", "");
properties.setProperty("default.password", "");
JDBCInterpreter jdbcInterpreter = new JDBCInterpreter(properties);
jdbcInterpreter.open();
InterpreterResult interpreterResult = jdbcInterpreter.interpret("SELECT 1", context);
assertEquals(InterpreterResult.Code.ERROR, interpreterResult.code());
assertEquals("Connection URL contains improper configuration",
interpreterResult.message().get(0).getData());
void testValidateConnectionUrlAllowLoadLocalInFile() throws IOException, InterpreterException {
testBannedMySQLQueryParam("allowLoadLocalInfile=true");
testBannedMySQLQueryParamWithValidParam("allowLoadLocalInfile=true");
}

@Test
void testValidateConnectionUrlAllowLoadLocal() throws IOException, InterpreterException {
testBannedMySQLQueryParam("allowLoadLocal=true");
}

@Test
void testValidateConnectionUrlSocketFactory() throws IOException, InterpreterException {
// it easier to unit test with H2 but this is really a Postgres issue
testBannedH2QueryParam("socketFactory=com.example.MySocketFactory");
}

@Test
void testValidateConnectionUrlEncoded() throws IOException, InterpreterException {
testBannedMySQLQueryParam("%61llowLoadLocalInfile=true");
// also test encoded param with %2561 (%25 is the encoded form of %)
testBannedMySQLQueryParam("%2561llowLoadLocalInfile=true");
}

@Test
void testValidateConnectionH2UrlWithInit() throws IOException, InterpreterException {
testBannedH2QueryParam("INIT=RUNSCRIPT FROM 'http://localhost/init.sql'");
}

@Test
void testValidateConnectionMySQLProps() throws IOException, InterpreterException {
testBannedURL("com.mysql.cj.jdbc.Driver",
"jdbc:mysql://(host=myhost,port=1111,allowLoadLocalInfile=true)/db");
}

@Test
void testValidateConnectionMySQLProps2() throws IOException, InterpreterException {
testBannedURL("com.mysql.cj.jdbc.Driver",
"jdbc:mysql://[(host=myhost,port=1111,allowLoadLocalInfile=true),(host=myhost2)]/db");
}

@Test
void testValidateConnectionMySQLProps3() throws IOException, InterpreterException {
testBannedURL("com.mysql.cj.jdbc.Driver",
"jdbc:mysql://address=(host=172.18.0.1)(port=3309)" +
"(%2561llowLoadLocalInfile=true),localhost:3306/test");
}

@Test
void testValidateConnectionMySQLWeirdPassword() throws IOException, InterpreterException {
// we strongly discourage putting passwords in the URL
assertDoesNotThrow(() -> JDBCInterpreter.validateConnectionUrl
("jdbc:mysql://localhost:3306/test?user=xyz&password=(allowLoadLocalInfile)"));
}

private void testBannedH2QueryParam(String param) throws IOException, InterpreterException {
testBannedURL("org.h2.Driver", getJdbcConnection() + ";" + param);
}

private void testBannedMySQLQueryParam(String param) throws IOException, InterpreterException {
testBannedURL("org.h2.Driver", "jdbc:mysql://localhost/test?" + param);
}

private void testBannedMySQLQueryParamWithValidParam(String param)
throws IOException, InterpreterException {
testBannedURL("org.h2.Driver", "jdbc:mysql://localhost/test?paranoid=true&" + param);
}

private void testBannedURL(String driver, String url) throws IOException, InterpreterException {
Properties properties = new Properties();
properties.setProperty("default.driver", "org.h2.Driver");
properties.setProperty("default.url", getJdbcConnection() + ";%61llowLoadLocalInfile=true");
properties.setProperty("default.driver", driver);
properties.setProperty("default.url", url);
properties.setProperty("default.user", "");
properties.setProperty("default.password", "");
JDBCInterpreter jdbcInterpreter = new JDBCInterpreter(properties);
jdbcInterpreter.open();
InterpreterResult interpreterResult = jdbcInterpreter.interpret("SELECT 1", context);
assertEquals(InterpreterResult.Code.ERROR, interpreterResult.code());
assertEquals("Connection URL contains improper configuration",
interpreterResult.message().get(0).getData());
interpreterResult.message().get(0).getData());
}

private InterpreterContext getInterpreterContext() {
Expand Down
Loading