
[Bug] [S3File] [zeta-local] Error writing to S3File in version 2.3.4: java.lang.IllegalStateException: Connection pool shut down #6678

Closed
2 of 3 tasks
LeonYoah opened this issue Apr 10, 2024 · 20 comments · Fixed by #6717

@LeonYoah
Contributor

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

When a jdbc -> S3File job runs in local mode, the error occurs sporadically (some runs succeed, some fail); in cluster mode there is no problem:
image

I searched for this aws-sdk error ("Connection pool shut down"); a related issue is awslabs/amazon-sqs-java-messaging-lib#96.
That issue points to connection pools being shared across threads, so my initial guess is that the [aggregate commit] runs after [S3File] finishes its [sink] step: the [sink] closes the filesystem, so the [rename] inside [commit] fails because the connection pool is already shut down (a minimal sketch of this suspicion follows the attachment below). But then why is local mode flaky while cluster mode is fine?
I also tried local mode on 2.3.3, and the problem does not occur there.
error.txt
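
For illustration, a minimal, self-contained sketch of the suspected failure mode, assuming only that both the sink and the aggregated committer resolve the same s3a URI through Hadoop's FileSystem.get(); the class name, bucket, and paths are illustrative:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical reproduction of the suspicion above: FileSystem.get() caches
// instances by (scheme, authority, ugi), so the sink and the committer share
// one S3A client unless fs.<scheme>.impl.disable.cache is set to true.
public class SharedFsCacheDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        URI uri = URI.create("s3a://xugurtp/");

        FileSystem sinkFs = FileSystem.get(uri, conf);      // sink's handle
        FileSystem committerFs = FileSystem.get(uri, conf); // same cached object

        sinkFs.close(); // the sink finishes and closes "its" filesystem ...

        // ... and the committer's rename then fails with
        // java.lang.IllegalStateException: Connection pool shut down,
        // because the shared S3A HTTP client was already shut down.
        committerFs.rename(
                new Path("s3a://xugurtp/tmp/part-0.json"),
                new Path("s3a://xugurtp/out/part-0.json"));
    }
}
```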

SeaTunnel Version

2.3.4

SeaTunnel Config

```
env {
  execution.parallelism = 1
  job.mode = "BATCH"
}

source {
  Jdbc {
    result_table_name = "TEST_100W"
    query = "select * from SYSDBA.TEST_100W limit 10"
    fetch_size = 5000
    table_path = "SYSDBA.TEST_100W"
    driver = "com.mysql.cj.jdbc.Driver"
    url = "jdbc:mysql://10.28.23.xxx:3306/test"
    user = "test"
    password = "xxx"
  }
}

sink {
  S3File {
    source_table_name = "aa"
    path = "/xugurtp/seatunnel/tmp/6af80b38f3434aceb573cc65b9cd12216a/3918"
    bucket = "s3a://xugurtp"
    fs.s3a.endpoint = "http://10.28.23.xxx:9010"
    fs.s3a.aws.credentials.provider = "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
    access_key = "xxxx"
    secret_key = "xxxxxxx"
    custom_filename = true
    file_name_expression = "output_params"
    file_format_type = "json"
    is_enable_transaction = false
    # sink_columns = ["ID"]
  }
}
```

Running Command

./bin/seatunnel.sh  -e local --config job/s3_sink.conf

Error Exception

2024-04-10 14:02:28,387 ERROR [.c.FileSinkAggregatedCommitter] [hz.main.generic-operation.thread-43] - commit aggregatedCommitInfo error, aggregatedCommitInfo = FileAggregatedCommitInfo(transactionMap={/tmp/seatunnel/seatunnel/830321799827816449/3eff006528/T_830321799827816449_3eff006528_0_1={/tmp/seatunnel/seatunnel/830321799827816449/3eff006528/T_830321799827816449_3eff006528_0_1/NON_PARTITION/output_params_0.json=/xugurtp/seatunnel/tmp/6af80b38f3434aceb573cc65b9cd12216a/3918/output_params_0.json}}, partitionDirAndValuesMap={}) 
java.lang.IllegalStateException: Connection pool shut down
        at com.amazonaws.thirdparty.apache.http.util.Asserts.check(Asserts.java:34) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.thirdparty.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:184) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.requestConnection(PoolingHttpClientConnectionManager.java:251) ~[aws-java-sdk-bundle-1.11.271.jar:?]

Zeta or Flink or Spark Version

Zeta (local mode)

Java or Scala Version

1.8

Screenshots

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@LeonYoah LeonYoah added the bug label Apr 10, 2024
@LeonYoah
Contributor Author

In addition, there is nothing wrong when debugging in IDEA; the error only happens on the server.

@LeonYoah
Contributor Author

@ruanwenjun this may be related to your issue #5903, about whether the checkpoint uses HDFS or a cached filesystem. I saw that you submitted #6039 and handled the cache problem there, but the checkpoint side did not, in this class: org.apache.seatunnel.engine.checkpoint.storage.hdfs.common.HdfsConfiguration
image

@LeonYoah
Contributor Author

And in this code (two screenshots attached): sometimes hadoopConf's getSchema method returns s3n instead of s3a, which means the s3a FileSystem the connector uses actually stays cached. A sketch of why the schema matters is below.
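
A minimal sketch of the consequence, assuming (as the screenshots suggest) that the Hadoop cache-disable key is derived from S3Conf's schema; the method and class names here are illustrative, not the exact SeaTunnel code:

```java
import org.apache.hadoop.conf.Configuration;

// If getSchema() returns "s3n", the cache is disabled for the wrong
// filesystem: fs.s3n.impl.disable.cache gets set, but the job actually
// talks to s3a, so the s3a FileSystem instance stays shared in the cache.
public class CacheKeyDemo {
    static Configuration buildHadoopConf(String schema) {
        Configuration hadoopConf = new Configuration();
        hadoopConf.setBoolean(String.format("fs.%s.impl.disable.cache", schema), true);
        return hadoopConf;
    }

    public static void main(String[] args) {
        Configuration conf = buildHadoopConf("s3n"); // the buggy case
        System.out.println(conf.get("fs.s3n.impl.disable.cache")); // true
        System.out.println(conf.get("fs.s3a.impl.disable.cache")); // null: still cached
    }
}
```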

@LeonYoah
Contributor Author

LeonYoah commented Apr 15, 2024

The s3n appears because the S3Conf obtained by AggregatedCommit does not go through this buildWithConfig method and instead uses DEFAULT_SCHEMA. Debugging suggests it is related to the DAG:

buildWithConfig method screenshot:

image

Screenshots of the S3Conf initialized from the DAG:

image

image

Screenshot of the hadoopConf and schema obtained by AggregatedCommit:

image

So far I have found two solutions (both sketched in code below):

1. Change DEFAULT_SCHEMA to s3a:

image

2. Set the cache-disable property in the configuration file:

image

But I am not familiar with the DAG and AggregatedCommit code, so I need help from the maintainers to find the root cause!
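
A code sketch of the two workarounds above, under the assumption that S3Conf keeps a DEFAULT_SCHEMA constant and builds a Hadoop Configuration from it; the names are illustrative:

```java
import org.apache.hadoop.conf.Configuration;

public class S3ConfWorkarounds {
    // Workaround 1: default the schema to "s3a" (instead of "s3n"), so any
    // code path that skips buildWithConfig() no longer picks the wrong scheme.
    private static final String DEFAULT_SCHEMA = "s3a";

    static String schemaOrDefault(String schemaFromConfig) {
        return schemaFromConfig != null ? schemaFromConfig : DEFAULT_SCHEMA;
    }

    // Workaround 2: force-disable the FileSystem cache for s3a in the Hadoop
    // configuration, so the committer never receives an already-closed
    // shared instance.
    static Configuration withCacheDisabled(Configuration hadoopConf) {
        hadoopConf.setBoolean("fs.s3a.impl.disable.cache", true);
        return hadoopConf;
    }
}
```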

@LeonYoah LeonYoah changed the title [Bug] [S3File] [zeta-local] There is an occasional problem with S3File writing. An error was reported [Bug] [S3File] [zeta-local] Write times wrong version 2.3.4 S3File: Java lang. An IllegalStateException: Connection pool shut down Apr 15, 2024
@LeonYoah LeonYoah changed the title [Bug] [S3File] [zeta-local] Write times wrong version 2.3.4 S3File: Java lang. An IllegalStateException: Connection pool shut down [Bug] [S3File] [zeta-local] Error writing to S3File in version 2.3.4:: Java lang. An IllegalStateException: Connection pool shut down Apr 15, 2024
@LeonYoah
Contributor Author

@EricJoy2048 I see that you have been working on the multi-table feature #6698 of the S3File connector recently. Have you ever seen the schema in the Hadoop conf obtained by the multi-table path be the default s3n instead of the s3a specified in the configuration file?

@EricJoy2048
Member

I'll look at that as soon as I can

@EricJoy2048
Member

  1. SeaTunnel Engine (Zeta) uses the HDFS API to write checkpoints to a FileSystem. The config is at $SEATUNNEL_HOME/conf/seatunnel.yaml, and the code is in the seatunnel-engine/seatunnel-engine-storage/checkpoint-storage-plugins/checkpoint-storage-hdfs/src/main/java/org/apache/seatunnel/engine/checkpoint/storage/hdfs/common package. I do not know the content of this config file in your environment, but I think we need to disable the cache in checkpoint storage too (a sketch follows below).
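
A minimal sketch of what "disable the cache in checkpoint storage" could look like in that package's configuration builder; this is illustrative, not the merged fix:

```java
import org.apache.hadoop.conf.Configuration;

public class CheckpointStorageConfSketch {
    static Configuration buildConfiguration(String schema) {
        Configuration conf = new Configuration();
        // Without this, the checkpoint writer and the file connector can share
        // one cached FileSystem; closing it in one place breaks the other.
        conf.setBoolean(String.format("fs.%s.impl.disable.cache", schema), true);
        return conf;
    }
}
```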

@EricJoy2048
Member

  2. I can show you how the FileSinkAggregatedCommitter is initialized:
(three screenshots)

@EricJoy2048
Member

image

@EricJoy2048
Member

> Sometimes hadoopConf's getSchema method returns s3n instead of s3a, resulting in the s3 connector actually being cached

Is this reproducible?

@LeonYoah
Contributor Author

> Is this reproducible?

Yes, I can now reproduce it 100% of the time. I downloaded the official SeaTunnel 2.3.4 package and plugins, deployed them on a server, and ran in local mode; the problem always appears there. But it cannot be reproduced when debugging locally in IDEA.

@EricJoy2048
Member

> Sometimes hadoopConf's getSchema method returns s3n instead of s3a, resulting in the s3 connector actually being cached

Can you try remote debugging?

@LeonYoah
Contributor Author

LeonYoah commented Apr 16, 2024

> Can you try remote debugging?

Yes, remote debugging shows that s3n is actually being passed:
image

@EricJoy2048
Member

> I noticed that this code is actually fine: calling the buildWithConfig method assigns the schema as s3a, but the key point is that when the whole sink is passed downstream to the multi-table sink, a deserialization step is performed.

I understand now. It is really a matter of serialization and deserialization of static variables. I don't think we should define static variables in classes that need to be serialized and deserialized (a sketch of the non-static shape is below). Can you put up a PR to fix this?
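
A sketch of the recommended shape, assuming an S3Conf-like class; the class and field names are illustrative. Per-instance state lives in non-static fields so it travels with the serialized object:

```java
import java.io.Serializable;

public class S3ConfFixed implements Serializable {
    private static final long serialVersionUID = 1L;

    // Instance field, not static: it is serialized with the object, so the
    // downstream multi-table sink deserializes "s3a" instead of falling
    // back to a class-level default.
    private String schema = "s3a";

    public String getSchema() {
        return schema;
    }

    public void setSchema(String schema) {
        this.schema = schema;
    }
}
```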

@LeonYoah
Contributor Author

I seem to have spotted the problem. This code is actually fine: calling the buildWithConfig method assigns [SCHEMA] to "s3a". But when the whole [sink] is passed downstream to [multiTableSink], it is deserialized, and deserialization re-initializes static state. Since the member variables of the whole S3Conf class are [static], including [SCHEMA], the [multiTableSink] ends up with the default value of [SCHEMA], which is "s3n". A self-contained demo of this behavior is below.

image
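
A demo of the root cause described above: static fields are not part of Java serialization, so a value assigned by buildWithConfig is lost once the object crosses a serialization boundary. Here the class re-initialization on the remote side is simulated by resetting the static field in the same JVM:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class StaticSerializationDemo implements Serializable {
    private static final long serialVersionUID = 1L;

    static String SCHEMA = "s3n"; // default, like S3Conf's DEFAULT_SCHEMA

    public static void main(String[] args) throws Exception {
        SCHEMA = "s3a"; // buildWithConfig() sets the correct schema ...

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new StaticSerializationDemo()); // SCHEMA is not written
        }

        SCHEMA = "s3n"; // ... but the receiving JVM starts from the class default

        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            in.readObject(); // restores instance state only
        }

        System.out.println(SCHEMA); // prints "s3n": the static value was lost
    }
}
```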

@LeonYoah
Contributor Author

> Can you put up a PR to fix this?

OK, I am willing to submit the PR. I have already drafted a first version that removes the [static] modifiers, but I see that you submitted a change to the S3 connector, so my submission may cause a conflict.

@EricJoy2048
Member

> OK, I am willing to submit the PR. I have already drafted a first version that removes the [static] modifiers, but I see that you submitted a change to the S3 connector, so my submission may cause a conflict.

Don't worry, I will resolve the conflict after your PR is merged.

@EricJoy2048
Member

Related PR: #6698

@EricJoy2048 EricJoy2048 reopened this Apr 16, 2024
@EricJoy2048
Member

You can add "close #6678" to the PR description to ensure that the issue is automatically closed when the PR merges.

@LeonYoah
Contributor Author

> You can add "close #6678" to the PR description to ensure that the issue is automatically closed when the PR merges.

Ok
