How do you programmatically configure hazelcast for the multicast discovery mechanism?
Details:
The documentation only supplies an example for TCP/IP and is out-of-date: it uses Config.setPort(), which no longer exists.
My configuration looks like this, but discovery does not work (i.e. I get the output "Members: 1"
:
Config cfg = new Config();
NetworkConfig network = cfg.getNetworkConfig();
network.setPort(PORT_NUMBER);
JoinConfig join = network.getJoin();
join.getTcpIpConfig().setEnabled(false);
join.getAwsConfig().setEnabled(false);
join.getMulticastConfig().setEnabled(true);
join.getMulticastConfig().setMulticastGroup(MULTICAST_ADDRESS);
join.getMulticastConfig().setMulticastPort(PORT_NUMBER);
join.getMulticastConfig().setMulticastTimeoutSeconds(200);
HazelcastInstance instance = Hazelcast.newHazelcastInstance(cfg);
System.out.println("Members: "+hazelInst.getCluster().getMembers().size());
Update 1, taking asimarslan's answer into account
If I fumbled with the MulticastTimeout, I either get "Members: 1"
or
Dec 05, 2013 8:50:42 PM com.hazelcast.nio.ReadHandler WARNING:
[192.168.0.9]:4446 [dev] hz._hzInstance_1_dev.IO.thread-in-0 Closing
socket to endpoint Address[192.168.0.7]:4446,
Cause:java.io.EOFException: Remote socket closed! Dec 05, 2013 8:57:24
PM com.hazelcast.instance.Node SEVERE: [192.168.0.9]:4446 [dev] Could
not join cluster, shutting down!
com.hazelcast.core.HazelcastException: Failed to join in 300 seconds!
Update 2, taking pveentjer's answer about using tcp/ip into account
If I change the configuration to the following, I still only get 1 member:
Config cfg = new Config();
NetworkConfig network = cfg.getNetworkConfig();
network.setPort(PORT_NUMBER);
JoinConfig join = network.getJoin();
join.getMulticastConfig().setEnabled(false);
join.getTcpIpConfig().addMember("192.168.0.1").addMember("192.168.0.2").
addMember("192.168.0.3").addMember("192.168.0.4").
addMember("192.168.0.5").addMember("192.168.0.6").
addMember("192.168.0.7").addMember("192.168.0.8").
addMember("192.168.0.9").addMember("192.168.0.10").
addMember("192.168.0.11").setRequiredMember(null).setEnabled(true);
//this sets the allowed connections to the cluster? necessary for multicast, too?
network.getInterfaces().setEnabled(true).addInterface("192.168.0.*");
HazelcastInstance instance = Hazelcast.newHazelcastInstance(cfg);
System.out.println("debug: joined via "+join+" with "+hazelInst.getCluster()
.getMembers().size()+" members.");
More precisely, this run produces the output
debug: joined via JoinConfig{multicastConfig=MulticastConfig
[enabled=false, multicastGroup=224.2.2.3, multicastPort=54327,
multicastTimeToLive=32, multicastTimeoutSeconds=2,
trustedInterfaces=[]], tcpIpConfig=TcpIpConfig [enabled=true,
connectionTimeoutSeconds=5, members=[192.168.0.1, 192.168.0.2,
192.168.0.3, 192.168.0.4, 192.168.0.5, 192.168.0.6, 192.168.0.7, 192.168.0.8, 192.168.0.9, 192.168.0.10, 192.168.0.11], requiredMember=null], awsConfig=AwsConfig{enabled=false,
region='us-east-1', securityGroupName='null', tagKey='null',
tagValue='null', hostHeader='ec2.amazonaws.com',
connectionTimeoutSeconds=5}} with 1 members.
My non-hazelcast-implementation is using UDP multicasts and works fine. So can a firewall really be the problem?
Update 3, taking pveentjer's answer about checking the network into account
Since I do not have permissions for iptables or to install iperf, I am using com.hazelcast.examples.TestApp
to check whether my network is working, as described in Getting Started With Hazelcast in Chapter 2, Section "Showing Off Straight Away":
I call java -cp hazelcast-3.1.2.jar com.hazelcast.examples.TestApp
on 192.168.0.1 and get the output
...Dec 10, 2013 11:31:21 PM com.hazelcast.instance.DefaultAddressPicker
INFO: Prefer IPv4 stack is true.
Dec 10, 2013 11:31:21 PM com.hazelcast.instance.DefaultAddressPicker
INFO: Picked Address[192.168.0.1]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
Dec 10, 2013 11:31:22 PM com.hazelcast.system
INFO: [192.168.0.1]:5701 [dev] Hazelcast Community Edition 3.1.2 (20131120) starting at Address[192.168.0.1]:5701
Dec 10, 2013 11:31:22 PM com.hazelcast.system
INFO: [192.168.0.1]:5701 [dev] Copyright (C) 2008-2013 Hazelcast.com
Dec 10, 2013 11:31:22 PM com.hazelcast.instance.Node
INFO: [192.168.0.1]:5701 [dev] Creating MulticastJoiner
Dec 10, 2013 11:31:22 PM com.hazelcast.core.LifecycleService
INFO: [192.168.0.1]:5701 [dev] Address[192.168.0.1]:5701 is STARTING
Dec 10, 2013 11:31:24 PM com.hazelcast.cluster.MulticastJoiner
INFO: [192.168.0.1]:5701 [dev]
Members [1] {
Member [192.168.0.1]:5701 this
}
Dec 10, 2013 11:31:24 PM com.hazelcast.core.LifecycleService
INFO: [192.168.0.1]:5701 [dev] Address[192.168.0.1]:5701 is STARTED
I then call java -cp hazelcast-3.1.2.jar com.hazelcast.examples.TestApp
on 192.168.0.2 and get the output
...Dec 10, 2013 9:50:22 PM com.hazelcast.instance.DefaultAddressPicker
INFO: Prefer IPv4 stack is true.
Dec 10, 2013 9:50:22 PM com.hazelcast.instance.DefaultAddressPicker
INFO: Picked Address[192.168.0.2]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
Dec 10, 2013 9:50:23 PM com.hazelcast.system
INFO: [192.168.0.2]:5701 [dev] Hazelcast Community Edition 3.1.2 (20131120) starting at Address[192.168.0.2]:5701
Dec 10, 2013 9:50:23 PM com.hazelcast.system
INFO: [192.168.0.2]:5701 [dev] Copyright (C) 2008-2013 Hazelcast.com
Dec 10, 2013 9:50:23 PM com.hazelcast.instance.Node
INFO: [192.168.0.2]:5701 [dev] Creating MulticastJoiner
Dec 10, 2013 9:50:23 PM com.hazelcast.core.LifecycleService
INFO: [192.168.0.2]:5701 [dev] Address[192.168.0.2]:5701 is STARTING
Dec 10, 2013 9:50:23 PM com.hazelcast.nio.SocketConnector
INFO: [192.168.0.2]:5701 [dev] Connecting to /192.168.0.1:5701, timeout: 0, bind-any: true
Dec 10, 2013 9:50:23 PM com.hazelcast.nio.TcpIpConnectionManager
INFO: [192.168.0.2]:5701 [dev] 38476 accepted socket connection from /192.168.0.1:5701
Dec 10, 2013 9:50:28 PM com.hazelcast.cluster.ClusterService
INFO: [192.168.0.2]:5701 [dev]
Members [2] {
Member [192.168.0.1]:5701
Member [192.168.0.2]:5701 this
}
Dec 10, 2013 9:50:30 PM com.hazelcast.core.LifecycleService
INFO: [192.168.0.2]:5701 [dev] Address[192.168.0.2]:5701 is STARTED
So multicast discovery is generally working on my cluster, right? Is 5701 also the port for discovery? Is 38476
in the last output an ID or a port?
Joining still does not work for my own code with programmatical configuration :(
Update 4, taking pveentjer's answer about using the default configuration into account
The modified TestApp gives the output
joinConfig{multicastConfig=MulticastConfig [enabled=true, multicastGroup=224.2.2.3,
multicastPort=54327, multicastTimeToLive=32, multicastTimeoutSeconds=2,
trustedInterfaces=[]], tcpIpConfig=TcpIpConfig [enabled=false,
connectionTimeoutSeconds=5, members=[], requiredMember=null],
awsConfig=AwsConfig{enabled=false, region='us-east-1', securityGroupName='null',
tagKey='null', tagValue='null', hostHeader='ec2.amazonaws.com', connectionTimeoutSeconds=5}}
and does detect other members after a couple of seconds (after each instance once lists only itself as a member if all are started at the same time), whereas
myProgram gives the output
joined via JoinConfig{multicastConfig=MulticastConfig [enabled=true, multicastGroup=224.2.2.3, multicastPort=54327, multica
stTimeToLive=32, multicastTimeoutSeconds=2, trustedInterfaces=[]], tcpIpConfig=TcpIpConfig [enabled=false, connectionTimeoutSecond
s=5, members=[], requiredMember=null], awsConfig=AwsConfig{enabled=false, region='us-east-1', securityGroupName='null', tagKey='nu
ll', tagValue='null', hostHeader='ec2.amazonaws.com', connectionTimeoutSeconds=5}} with 1 members.
and does not detect members within its runtime of about 1 minute (I am counting the members about every 5 seconds).
BUT if at least one instance of TestApp runs concurrently on the cluster, all TestApp instances and all myProgram instances are detected and my program works fine. In case I start TestApp once and then myProgram twice in parallel, TestApp gives the following output:
java -cp ~/CaseStudy/jtorx-1.10.0-beta8/lib/hazelcast-3.1.2.jar:. TestApp
Dec 12, 2013 12:02:15 PM com.hazelcast.instance.DefaultAddressPicker
INFO: Prefer IPv4 stack is true.
Dec 12, 2013 12:02:15 PM com.hazelcast.instance.DefaultAddressPicker
INFO: Picked Address[192.168.180.240]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
Dec 12, 2013 12:02:15 PM com.hazelcast.system
INFO: [192.168.180.240]:5701 [dev] Hazelcast Community Edition 3.1.2 (20131120) starting at Address[192.168.180.240]:5701
Dec 12, 2013 12:02:15 PM com.hazelcast.system
INFO: [192.168.180.240]:5701 [dev] Copyright (C) 2008-2013 Hazelcast.com
Dec 12, 2013 12:02:15 PM com.hazelcast.instance.Node
INFO: [192.168.180.240]:5701 [dev] Creating MulticastJoiner
Dec 12, 2013 12:02:15 PM com.hazelcast.core.LifecycleService
INFO: [192.168.180.240]:5701 [dev] Address[192.168.180.240]:5701 is STARTING
Dec 12, 2013 12:02:21 PM com.hazelcast.cluster.MulticastJoiner
INFO: [192.168.180.240]:5701 [dev]
Members [1] {
Member [192.168.180.240]:5701 this
}
Dec 12, 2013 12:02:22 PM com.hazelcast.core.LifecycleService
INFO: [192.168.180.240]:5701 [dev] Address[192.168.180.240]:5701 is STARTED
Dec 12, 2013 12:02:22 PM com.hazelcast.management.ManagementCenterService
INFO: [192.168.180.240]:5701 [dev] Hazelcast will connect to Management Center on address: http://localhost:8080/mancenter-3.1.2/
Join: JoinConfig{multicastConfig=MulticastConfig [enabled=true, multicastGroup=224.2.2.3, multicastPort=54327, multicastTimeToLive=32, multicastTimeoutSeconds=2, trustedInterfaces=[]], tcpIpConfig=TcpIpConfig [enabled=false, connectionTimeoutSeconds=5, members=[], requiredMember=null], awsConfig=AwsConfig{enabled=false, region='us-east-1', securityGroupName='null', tagKey='null', tagValue='null', hostHeader='ec2.amazonaws.com', connectionTimeoutSeconds=5}}
Dec 12, 2013 12:02:22 PM com.hazelcast.partition.PartitionService
INFO: [192.168.180.240]:5701 [dev] Initializing cluster partition table first arrangement...
hazelcast[default] > Dec 12, 2013 12:03:27 PM com.hazelcast.nio.SocketAcceptor
INFO: [192.168.180.240]:5701 [dev] Accepting socket connection from /192.168.0.8:38764
Dec 12, 2013 12:03:27 PM com.hazelcast.nio.TcpIpConnectionManager
INFO: [192.168.180.240]:5701 [dev] 5701 accepted socket connection from /192.168.0.8:38764
Dec 12, 2013 12:03:27 PM com.hazelcast.nio.SocketAcceptor
INFO: [192.168.180.240]:5701 [dev] Accepting socket connection from /192.168.0.7:54436
Dec 12, 2013 12:03:27 PM com.hazelcast.nio.TcpIpConnectionManager
INFO: [192.168.180.240]:5701 [dev] 5701 accepted socket connection from /192.168.0.7:54436
Dec 12, 2013 12:03:32 PM com