Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


ActiveMQ Artemis, Connections Accumulating

Is there a way to cause stale connections to time out in ActiveMQ Artemis? I have a situation where the connections are accumulating and then I get the "newSocketStream(..) failed: Too many open files" error, which I think is due to the connections.

How should I diagnose this problem?

2021-01-28 01:20:39,492 WARN  [io.netty.channel.DefaultChannelPipeline] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.: io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many open files

2021-01-28 01:20:39,656 WARN  [io.netty.channel.DefaultChannelPipeline] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.: io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many open files

2021-01-28 01:20:39,937 ERROR [org.apache.activemq.artemis.core.client] AMQ214016: Failed to create netty connection: io.netty.channel.ChannelException: Unable to create Channel from class class io.netty.channel.epoll.EpollSocketChannel
    at io.netty.channel.ReflectiveChannelFactory.newChannel(ReflectiveChannelFactory.java:46) [netty-all-4.1.48.Final.jar:4.1.48.Final]
    at io.netty.bootstrap.AbstractBootstrap.initAndRegister(AbstractBootstrap.java:310) [netty-all-4.1.48.Final.jar:4.1.48.Final]
    at io.netty.bootstrap.Bootstrap.doResolveAndConnect(Bootstrap.java:155) [netty-all-4.1.48.Final.jar:4.1.48.Final]
    at io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:139) [netty-all-4.1.48.Final.jar:4.1.48.Final]
    at org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector.createConnection(NettyConnector.java:818) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector.createConnection(NettyConnector.java:785) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.openTransportConnection(ClientSessionFactoryImpl.java:1076) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.createTransportConnection(ClientSessionFactoryImpl.java:1125) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.establishNewConnection(ClientSessionFactoryImpl.java:1336) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.getConnection(ClientSessionFactoryImpl.java:931) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.getConnectionWithRetry(ClientSessionFactoryImpl.java:820) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.connect(ClientSessionFactoryImpl.java:252) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.connect(ClientSessionFactoryImpl.java:268) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl$StaticConnector$Connector.tryConnect(ServerLocatorImpl.java:1813) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl$StaticConnector.connect(ServerLocatorImpl.java:1682) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:536) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:524) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl$4.run(ServerLocatorImpl.java:482) [artemis-core-client-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.14.0.jar:2.14.0]
    at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65) [artemis-commons-2.14.0.jar:2.14.0]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [java.base:]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [java.base:]
    at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.14.0.jar:2.14.0]
Caused by: java.lang.reflect.InvocationTargetException
    at jdk.internal.reflect.GeneratedConstructorAccessor17.newInstance(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) [java.base:]
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490) [java.base:]
    at io.netty.channel.ReflectiveChannelFactory.newChannel(ReflectiveChannelFactory.java:44) [netty-all-4.1.48.Final.jar:4.1.48.Final]
    ... 23 more
Caused by: io.netty.channel.ChannelException: io.netty.channel.unix.Errors$NativeIoException: newSocketStream(..) failed: Too many open files
    at io.netty.channel.unix.Socket.newSocketStream0(Socket.java:421) [netty-all-4.1.48.Final.jar:4.1.48.Final]
    at io.netty.channel.epoll.LinuxSocket.newSocketStream(LinuxSocket.java:319) [netty-all-4.1.48.Final.jar:4.1.48.Final]
    at io.netty.channel.epoll.LinuxSocket.newSocketStream(LinuxSocket.java:323) [netty-all-4.1.48.Final.jar:4.1.48.Final]
    at io.netty.channel.epoll.EpollSocketChannel.<init>(EpollSocketChannel.java:45) [netty-all-4.1.48.Final.jar:4.1.48.Final]
    ... 27 more
Caused by: io.netty.channel.unix.Errors$NativeIoException: newSocketStream(..) failed: Too many open files

This problem looks similar: SocketException : TOO MANY OPEN FILES

As for my use case, I'm receiving orders from a website and processing them into an ERP, then transmitting status back to the website and other systems. Sending messages back to the website API is a bit slow, and near the time of the incident there were maybe 700 messages queued.

The website uses AMQP and my message routing is done with JMS.

Here is the ulimit for the user that runs the broker.

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63805
max locked memory       (kbytes, -l) 16384
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63805
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

My JVM memory setting: -Xms1024M -Xmx8G

And here is my broker.xml

<configuration xmlns="urn:activemq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">

   <core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:activemq:core ">

      <name>0.0.0.0</name>
      <persistence-enabled>true</persistence-enabled>
      <journal-type>NIO</journal-type>
      <paging-directory>/nfs/amqprod/data/paging</paging-directory>
      <bindings-directory>/nfs/amqprod/data/bindings</bindings-directory>
      <journal-directory>/nfs/amqprod/data/journal</journal-directory>
      <large-messages-directory>/nfs/amqprod/data/large-messages</large-messages-directory>
      <journal-datasync>true</journal-datasync>
      <journal-min-files>2</journal-min-files>
      <journal-pool-files>10</journal-pool-files>
      <journal-device-block-size>4096</journal-device-block-size>
      <journal-file-size>10M</journal-file-size>
      <journal-buffer-timeout>2628000</journal-buffer-timeout>
      <journal-max-io>1</journal-max-io>
      <disk-scan-period>5000</disk-scan-period>
      <max-disk-usage>90</max-disk-usage>
      <critical-analyzer>true</critical-analyzer>
      <critical-analyzer-timeout>120000</critical-analyzer-timeout>
      <critical-analyzer-check-period>60000</critical-analyzer-check-period>
      <critical-analyzer-policy>HALT</critical-analyzer-policy>
      <page-sync-timeout>2628000</page-sync-timeout>
      <jmx-management-enabled>true</jmx-management-enabled>
      <global-max-size>2G</global-max-size>

      <acceptors>

         <!-- keystores will be found automatically if they are on the classpath -->
         <acceptor name="netty-ssl-acceptor">tcp://0.0.0.0:5500?sslEnabled=true;keyStorePath={path}/keystore.ks;keyStorePassword={password};protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE</acceptor>

         <!-- Acceptor for every supported protocol -->
         <acceptor name="artemis">tcp://0.0.0.0:61616?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;amqpMinLargeMessageSize=102400;protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE;useEpoll=true;amqpCredits=1000;amqpLowCredits=300;amqpDuplicateDetection=true</acceptor>


      </acceptors>

      <!-- HA -->
      <connectors>
        <connector name="artemis">tcp://{Primary IP}:61616</connector>
        <connector name="artemis-backup">tcp://{Secondary IP}:61616</connector>
      </connectors>

      <cluster-user>activemq</cluster-user>
      <cluster-password>{cluster password}</cluster-password>

      <ha-policy>
        <shared-store>
          <master>
            <failover-on-shutdown>true</failover-on-shutdown>
          </master>
        </shared-store>
      </ha-policy>

      <cluster-connections>
        <cluster-connection name="cluster-1">
          <connector-ref>artemis</connector-ref>
          <!--<discovery-group-ref discovery-group-name="discovery-group-1"/>-->
          <static-connectors>
            <connector-ref>artemis-backup</connector-ref>
          </static-connectors>
        </cluster-connection>
       </cluster-connections>
      <!-- HA -->

      <security-settings>
         <security-setting match="#">
            <permission type="createNonDurableQueue" roles="amq"/>
            <permission type="deleteNonDurableQueue" roles="amq"/>
            <permission type="createDurableQueue" roles="amq"/>
            <permission type="deleteDurableQueue" roles="amq"/>
            <permission type="createAddress" 


1 Reply


ActiveMQ Artemis already enforces a default idle timeout of 60 seconds for any AMQP client that connects through an acceptor where amqpIdleTimeout is not set. See the documentation for more details on that. Any "stale" AMQP connection should therefore be removed after about 60 seconds, and you'll see log messages indicating that a connection was cleaned up.

It's worth noting that, aside from network problems which interrupt connections, the most common cause of stale connections is poorly written clients that do not manage their resources properly.
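
To illustrate what "managing resources properly" can look like on the client side, here is a minimal Java sketch using try-with-resources. FakeConnection is a hypothetical stand-in (not an Artemis or JMS class) that simply counts open instances; with a real JMS 2.0 client the same pattern applies because javax.jms.Connection implements AutoCloseable.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ConnectionLeakDemo {

    /** Hypothetical stand-in for a broker connection; tracks how many are open. */
    static final class FakeConnection implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();

        FakeConnection() {
            OPEN.incrementAndGet();
        }

        void send(String body) {
            if (body.isEmpty()) {
                throw new IllegalArgumentException("empty message");
            }
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    /** Sends one message; the connection is closed even if send() throws. */
    static boolean sendSafely(String body) {
        try (FakeConnection connection = new FakeConnection()) {
            connection.send(body);
            return true;
        } catch (IllegalArgumentException e) {
            // The resource has already been closed by the time we get here.
            return false;
        }
    }

    public static void main(String[] args) {
        sendSafely("order-1");
        sendSafely(""); // fails, but must not leak the connection
        System.out.println("open connections: " + FakeConnection.OPEN.get());
    }
}
```

A long-lived client would normally reuse one connection rather than open one per message; the sketch only shows that cleanup happens on both the success and the failure path.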

In general, I think a ulimit of 1024 for open files is quite low for a modern system. I recommend you raise this substantially.
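
To see how close a JVM is to that limit, one option is the platform OperatingSystem MXBean. The sketch below is an assumption about how you might check it, not something from the Artemis documentation: it reports the descriptors of the JVM it runs in, and FdCheck and usedFraction are illustrative names. On a Unix JVM the same two values are usually exposed over JMX as the OpenFileDescriptorCount and MaxFileDescriptorCount attributes of java.lang:type=OperatingSystem, so with JMX enabled on the broker they can be read remotely.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class FdCheck {

    /** Fraction of the descriptor limit in use, e.g. 512 of 1024 gives 0.5. */
    static double usedFraction(long open, long max) {
        if (max <= 0) {
            throw new IllegalArgumentException("max must be positive");
        }
        return (double) open / max;
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            long open = unix.getOpenFileDescriptorCount();
            long max = unix.getMaxFileDescriptorCount();
            System.out.printf("open=%d max=%d used=%.1f%%%n",
                    open, max, 100 * usedFraction(open, max));
        } else {
            // The descriptor counters are only available on Unix-like JVMs.
            System.out.println("File descriptor counts are not available on this platform");
        }
    }
}
```

Sampling these values over time shows whether the open count climbs steadily (which would suggest a leak) or whether normal load simply needs more than 1024 descriptors.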

