WebSphere MQ is the messaging middleware product in IBM's software family. Its security mechanisms, simple and fast programming style, stability, scalability, cross-platform support, and strong messaging capabilities have won it a high market share in banking, telecommunications, transportation, government, and other industries. In China, WebSphere MQ also has a broad user base and many successful deployments. It works across platforms and across networks, and its delivery mechanism guarantees that each message is transmitted "once and only once", with no loss and no duplication. Among the many benefits WebSphere MQ brings to customers, an important one is its ability to detect communication failures and recover from them. This matters especially in China, where network lines are often poor and network status is unstable. In addition to synchronous communication, WebSphere MQ supports asynchronous communication based on the store-and-forward mechanism of message queues: the application simply hands the message to WebSphere MQ, and WebSphere MQ takes responsibility for delivering it securely and reliably, with no involvement from the application and no manual intervention. When the network fails, or when the remote host fails, WebSphere MQ can detect the network status automatically and resume work once the network returns to normal.
This capability depends on correct configuration of both the operating system's TCP/IP parameters and WebSphere MQ's own parameters. In addition, channel management is the most complicated and most important part of WebSphere MQ system configuration and administration. This article explains how to configure TCP/IP parameters to achieve better communication recovery, and briefly discusses several effective techniques for WebSphere MQ channel management.
1. How to configure the operating system TCP/IP parameters and MQ to achieve reconnection after network outages and fault recovery
As a message transport product, WebSphere MQ is built on top of TCP/IP, so it is closely tied to the TCP/IP behavior of the operating system and the network. In many cases we have to modify the operating system's TCP/IP parameters so that they serve WebSphere MQ better and let it deliver its full capability. This section gives a brief discussion of the TCP/IP parameter settings.
Because WebSphere MQ channels are unidirectional, communication between two queue managers requires channel definitions of two types, for example a Sender channel at one end and a Receiver channel at the other. Each end of a channel is run by an MCA (Message Channel Agent), which manages and monitors the starting and stopping of the channel and keeps the message sequence numbers at the two ends in step, to ensure that each message is transmitted "once and only once". WebSphere MQ's system configuration file, mqs.ini, holds settings such as log size, channel properties, and the parameters used when coordinating with databases through the XA standard. It also has a stanza that controls TCP/IP behavior, namely:
TCP:
KeepAlive=YES
or
KeepAlive=NO
Its role is as follows: KeepAlive=YES makes the operating system's TCP/IP keepalive settings apply to WebSphere MQ connections. The MCA at the receiving end of a channel is always waiting for messages from the sender, so it does not know when the sender will stop sending, when the network has failed, or when the sender has gone from running to stopped. When the network connection breaks, the sender channel changes from the RUNNING state to the RETRYING state and the sender tries to re-establish the connection. The receiver channel, however, has not stopped and remains in a false RUNNING state, so the sender gets a "channel is in use" error and cannot restart the channel. The reason is this: a network failure that occurs while the sender MCA has the channel running, before any orderly disconnect, destroys the TCP/IP socket connection. When the sender stops and restarts the channel, it has to establish a new socket connection, but the receiver is still blocked in a receive call on the original socket, which does not match the sender's new socket, so the new connection attempt fails.
We can use TCP/IP keepalive to overcome this and achieve better reconnection. On common operating system platforms (Windows 2000/NT, AIX, HP-UX, Sun Solaris, Linux, and so on) the default keepalive time is 2 hours: the first keepalive probe is sent only after the connection has been idle for 2 hours, so it takes 2 hours for the receiver to learn that the connection has been broken, which is clearly unacceptable. By configuring the TCP/IP keepalive parameters we can speed up this detection, so that WebSphere MQ quickly drops the channel connection when the network fails and the channel can be restarted. Only then, while the sender channel goes from RUNNING to RETRYING, can the receiver channel go from RUNNING to the "not found" state, which allows the connection to be re-established and the channel to return to RUNNING.
To achieve this, we need to do the following:
1) Modify the WebSphere MQ system configuration file mqs.ini, adding the following:
TCP:
KeepAlive=YES
This makes the system's TCP/IP settings take effect for WebSphere MQ.
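A quick way to confirm that the stanza is present is to scan the configuration file text. The following is a minimal sketch, assuming the stanza layout shown above (a stanza header ending in a colon, followed by Name=Value lines) and assuming that lines starting with # or ; are comments. The function name keepalive_enabled and the sample text are illustrative and are not part of WebSphere MQ.

```python
def keepalive_enabled(ini_text: str) -> bool:
    """Return True if the TCP stanza contains KeepAlive=YES (case-insensitive)."""
    in_tcp = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' or ';' carry no settings.
        if not line or line.startswith(("#", ";")):
            continue
        if line.endswith(":"):
            # A line ending in ':' opens a new stanza.
            in_tcp = line[:-1].strip().upper() == "TCP"
            continue
        if in_tcp and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key.upper() == "KEEPALIVE":
                return value.upper() == "YES"
    return False


# Hypothetical file content used only for illustration.
sample = """
Channels:
   AdoptNewMCA=ALL
TCP:
   KeepAlive=YES
"""
print(keepalive_enabled(sample))  # True
```

In practice the text would be read from the actual configuration file on the queue manager host.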
2) Modify the TCP/IP parameters of the operating system.
The procedure differs slightly from system to system. The following briefly covers Windows 2000/NT, RS/6000 (AIX), HP-UX, Sun Solaris, and SCO OpenServer.
On the Windows NT platform, use regedit to modify the following three parameters under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters in the system registry:
KeepAliveInterval: set to 1000 (in milliseconds)
KeepAliveTime: set to 300000 (in milliseconds; 300000 is 5 minutes)
TcpMaxDataRetransmissions: set to 5
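To see what these three values buy, the time needed to notice a dead connection can be estimated as the idle time plus one probe interval for each unanswered probe. The sketch below is a rough model only: it assumes that TcpMaxDataRetransmissions bounds the number of unanswered keepalive probes, and the function name detection_time_s is hypothetical.

```python
def detection_time_s(idle_ms: int, interval_ms: int, probes: int) -> float:
    """Estimate, in seconds, how long a broken connection goes unnoticed.

    idle_ms     -- idle time before the first keepalive probe (KeepAliveTime)
    interval_ms -- gap between unanswered probes (KeepAliveInterval)
    probes      -- unanswered probes before giving up (assumed to be
                   TcpMaxDataRetransmissions)
    """
    return (idle_ms + probes * interval_ms) / 1000.0


# The 2-hour default idle time alone, before any probe is sent.
print(2 * 60 * 60 * 1000 / 1000.0)           # 7200.0 seconds

# The tuned values suggested above.
print(detection_time_s(300_000, 1_000, 5))   # 305.0 seconds
```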
On the RS/6000 (AIX) platform, use the no command to modify the following parameters:
tcp_keepidle: how long a TCP/IP connection is kept idle before keepalive probing starts. The unit is 0.5 seconds and the default is 14400, that is, 2 hours. We can set it to 5 minutes.
tcp_keepinit: the initial timeout for establishing a TCP connection. The unit is 0.5 seconds and the default is 150. We can set it to 50.
tcp_keepintvl: the interval between keepalive probes. The unit is 0.5 seconds and the default is 150. We can set it to 50.
We can also modify the /etc/rc.net file:
/usr/sbin/no -o tcp_keepidle=240
/usr/sbin/no -o tcp_keepinit=50
/usr/sbin/no -o tcp_keepintvl=50
Note: changes made directly on the command line are lost when the machine restarts; changes made in the rc.net file take effect permanently.
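Because these AIX parameters are expressed in half-second units, it is easy to set a value that is twice or half what was intended. The following small sketch converts between seconds and half-second units and builds the corresponding no command line. The helper names are hypothetical. Note that a 5-minute idle time corresponds to 600 units, while the value 240 in the listing above corresponds to 2 minutes.

```python
def to_half_seconds(seconds: float) -> int:
    """Convert seconds to the 0.5-second units used by the AIX no options."""
    return round(seconds * 2)


def from_half_seconds(units: int) -> float:
    """Convert 0.5-second units back to seconds."""
    return units / 2


def no_command(option: str, seconds: float) -> str:
    """Build a no command line that sets the given option to a time in seconds."""
    return f"/usr/sbin/no -o {option}={to_half_seconds(seconds)}"


print(no_command("tcp_keepidle", 5 * 60))  # /usr/sbin/no -o tcp_keepidle=600
print(from_half_seconds(240))              # 120.0 seconds, i.e. 2 minutes
print(from_half_seconds(14400))            # 7200.0 seconds, i.e. 2 hours
```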
On the HP-UX platform:
For HP-UX 10.20 and earlier versions, use the /usr/contrib/bin/nettune command to modify the relevant parameters.
For HP-UX 10.30 and later versions, use the /usr/bin/ndd command to modify the relevant parameters.
On the Sun Solaris platform:
Modify the parameter with the command ndd -set /dev/tcp tcp_keepalive_interval nnn. The unit of tcp_keepalive_interval is milliseconds, and the default is 7200000 milliseconds, that is, 2 hours.
On the SCO OpenServer platform:
tcp_keepalive and tcp_keepidle have the same meaning. The original default is 7200 seconds, and it can be set to 600 seconds. The original default of tcp_keepintvl is 75 seconds, and it can be set to 15 seconds. Both are expressed in seconds.
Run the inconfig command to modify them:
/etc/inconfig tcp_keepidle
/etc/inconfig tcp_keepintvl
Note that the keepalive parameters at the two ends should be coordinated. If the keepalive value at the sending end is smaller than the keepalive value at the receiving end, the sender channel will still end up stopped after a network failure.
2. Tips for using AdoptNewMCA
WebSphere MQ 5.2 added a new parameter that controls how channels run, namely AdoptNewMCA. It is set in the Channels stanza of the qm.ini file, for example:
Channels:
AdoptNewMCA=ALL
Its possible values are NO, SVR, SNDR, RCVR, CLUSRCVR, ALL, and FASTPATH.
When MQ receives a request to start a channel and finds that an amqcrsta process for that channel already exists, that process must be stopped before the channel can start. The role of AdoptNewMCA is to stop that process and launch a new one for the new channel start request.
Like the TCP/IP parameter changes described earlier, this terminates a receiving-end channel that is stuck in the RUNNING state, so that the channel start operation or request succeeds.
If AdoptNewMCA is specified for a channel and a new channel start fails because the channel is already running, MQ can:
1) Ask the old channel instance to stop before the new channel starts.
2) If the old channel does not act on the stop request within the AdoptNewMCATimeout interval, kill the corresponding process (or thread).
3) If the old channel has still not ended after step 2, end the channel with a "channel in use" error when the AdoptNewMCATimeout interval expires a second time.
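The three steps above can be read as a small decision sequence. The sketch below is only a model of that sequence for illustration, not MQ code: the function adopt_sequence and its two boolean inputs are hypothetical, and real timing is left out.

```python
from typing import List


def adopt_sequence(stops_on_request: bool, stops_on_kill: bool) -> List[str]:
    """List the actions taken when a start request finds an old channel instance.

    stops_on_request -- the old instance ends within the first timeout interval
    stops_on_kill    -- the old instance ends after its process or thread is killed
    """
    actions = ["ask the old channel instance to stop"]
    if stops_on_request:
        actions.append("start the new channel")
        return actions
    actions.append("first timeout expired: kill the old process or thread")
    if stops_on_kill:
        actions.append("start the new channel")
        return actions
    actions.append("second timeout expired: fail with 'channel in use'")
    return actions


for step in adopt_sequence(stops_on_request=False, stops_on_kill=True):
    print(step)
```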
3. Tips for using the disconnect interval
Disconnect Interval is a parameter of channels of the sender and server types. If no message arrives on the transmission queue within the specified interval, the channel is stopped. With the disconnect interval set, the channel starts normally when the sender restarts it.
The value of Disconnect Interval affects channel performance. If it is set very low, the channel is started and stopped frequently, which degrades system performance. Choosing a suitable value can therefore improve channel performance effectively.
4. Tips for using the heartbeat interval
This section covers the Heartbeat Interval parameter (available in WebSphere MQ V5.1 for AIX, HP-UX, OS/2, Sun Solaris, and Windows NT/2000). Its role is as follows: if no message arrives on the transmission queue within the heartbeat interval, the sender MCA sends a heartbeat signal to the receiver MCA, which gives the receiver channel an opportunity to stop. The receiver can then stop the channel without having to wait for the Disconnect Interval to expire. At the same time, it releases the memory used to hold large messages and closes the queues that the receiver had open.
For Heartbeat Interval and Disconnect Interval to work together effectively, the Heartbeat Interval should generally be set to a smaller value than the Disconnect Interval.
Also, if the transport protocol is TCP/IP, we can use the TCP/IP SO_KEEPALIVE option to achieve this. With SO_KEEPALIVE set, TCP/IP itself checks the state of the other end of the connection at the configured interval, and if the other end has gone away, the channel is stopped. Here too, the TCP/IP interval should be less than the Disconnect Interval of the WebSphere MQ channel.
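For readers unfamiliar with SO_KEEPALIVE, the sketch below shows what turning it on looks like at the socket API level, using Python's standard socket module. It is not how WebSphere MQ is configured (MQ sets the option itself when keepalive is enabled); it only illustrates the mechanism. The per-socket tuning options are platform-specific, so the code applies them only where they exist, and the default values in the signature are illustrative assumptions.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 300,
                     interval_s: int = 15, probes: int = 5) -> bool:
    """Turn on TCP keepalive for one socket and report whether it is active."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning is not available on every platform; skip what is missing.
    tuning = (("TCP_KEEPIDLE", idle_s),
              ("TCP_KEEPINTVL", interval_s),
              ("TCP_KEEPCNT", probes))
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            pass  # the platform knows the option but refuses to set it here
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(enable_keepalive(sock))
sock.close()
```

Where the per-socket options are unavailable, the system-wide settings described in section 1 determine the probe timing.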
5. ShortRetry and LongRetry
For channels of the sender type and similar types, four parameters relate to communication recovery and channel connection: SHORTRTY, SHORTTMR, LONGRTY, and LONGTMR. Their defaults are 10, 60, 999999999, and 1200, representing the short retry count, the short retry interval (in seconds), the long retry count, and the long retry interval (in seconds). When the channel changes from RUNNING to RETRYING, it tries to reconnect: first with the short retries and, once those are exhausted, with the long retries. When choosing these four parameters, pay attention to the following two points:
1) Make sure that the short retry time plus the long retry time exceeds the fault recovery time.
For example, if you estimate that your system's fault recovery time is 1 hour, setting SHORTTMR * SHORTRTY + LONGTMR * LONGRTY > 2 hours ensures that the channel is still trying to reconnect automatically after the fault has been repaired.
2) The retry interval affects the efficiency of automatic recovery.
For example, suppose the short retries last 10 minutes in total and the long retry interval is 1 hour. If the network is restored after 15 minutes, the channel has already entered the long retry phase, so it may take up to 1 hour before a long retry brings the channel back to normal operation. On the other hand, the retry interval does not need to be too short either; choose a moderate setting that suits your requirements and actual situation.
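Both points can be checked with simple arithmetic before the channel is defined. The sketch below is a simplified model: it assumes that retries are spaced exactly one timer interval apart, starting one interval after the failure, and it ignores the time each connection attempt takes. The function names are hypothetical.

```python
import math
from typing import Optional


def total_retry_window_s(shortrty: int, shorttmr: int,
                         longrty: int, longtmr: int) -> int:
    """Total time, in seconds, during which the channel keeps retrying."""
    return shortrty * shorttmr + longrty * longtmr


def first_retry_at_or_after(outage_s: int, shortrty: int, shorttmr: int,
                            longrty: int, longtmr: int) -> Optional[int]:
    """Time, in seconds after the failure, of the first retry that can succeed.

    Returns None if all retries are used up before the outage ends.
    """
    short_total = shortrty * shorttmr
    if outage_s <= short_total:
        attempt = max(1, math.ceil(outage_s / shorttmr))
        return attempt * shorttmr
    attempt = math.ceil((outage_s - short_total) / longtmr)
    if attempt > longrty:
        return None
    return short_total + attempt * longtmr


# Point 1: does the retry window exceed 2 hours?
print(total_retry_window_s(10, 60, 6, 1200) > 2 * 3600)    # True (7800 s)

# Point 2: short retries cover 10 minutes, long interval is 1 hour,
# and the network comes back after 15 minutes.
print(first_retry_at_or_after(15 * 60, 10, 60, 99, 3600))  # 4200 s, i.e. 70 minutes
```

In the second case the channel reconnects 55 minutes after the network is back, which is the cost of a long retry interval that is too coarse.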
6. Several ways to stop the channel
Managing and monitoring channels is an important part of WebSphere MQ system administration. Under normal circumstances a message channel between two queue managers should stay up continuously, and it ends normally only when the WebSphere MQ system administrator stops it or when the Disconnect Interval expires. Stopping a channel from the sender side is very effective, but it requires the system administrator to intervene. On the receiver side the situation is more complicated, because the receiver MCA is passive: it only receives what arrives on the channel and cannot end the channel normally from its own end.
For channel operation, I think control can be divided into the following three cases.
If you want the channel to run continuously, control stopping from the sender side. When the channel is interrupted, the system administrator needs to restart it.
If you want the channel to stay running only while there are messages to send on it, set the channel's Disconnect Interval property.
In WebSphere MQ V5.1 for AIX, HP-UX, OS/2, Sun Solaris, and Windows NT/2000, set the channel's Heartbeat Interval property. The sender then sends heartbeat signals to the receiving end of the channel, which lets the receiver detect the status of the sender and stop the channel.
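The channel properties discussed in this article are set through MQSC channel attributes (DISCINT, HBINT, SHORTRTY, SHORTTMR, LONGRTY, LONGTMR). As a minimal sketch, the helper below assembles an ALTER CHANNEL command for a sender channel and applies the rule from section 4 that the heartbeat interval should be smaller than the disconnect interval. The helper name, the channel name QM1.TO.QM2, and the values are hypothetical and chosen only for illustration.

```python
from typing import Optional


def alter_sender_channel(name: str,
                         discint: Optional[int] = None,
                         hbint: Optional[int] = None,
                         shortrty: Optional[int] = None,
                         shorttmr: Optional[int] = None,
                         longrty: Optional[int] = None,
                         longtmr: Optional[int] = None) -> str:
    """Build an MQSC ALTER CHANNEL command containing only the attributes given."""
    # Rule from the article: heartbeat interval below disconnect interval.
    if discint and hbint and hbint >= discint:
        raise ValueError("HBINT should be smaller than DISCINT")
    attributes = {
        "DISCINT": discint, "HBINT": hbint,
        "SHORTRTY": shortrty, "SHORTTMR": shorttmr,
        "LONGRTY": longrty, "LONGTMR": longtmr,
    }
    parts = [f"ALTER CHANNEL({name}) CHLTYPE(SDR)"]
    parts += [f"{key}({value})" for key, value in attributes.items()
              if value is not None]
    return " ".join(parts)


print(alter_sender_channel("QM1.TO.QM2", discint=6000, hbint=300))
# ALTER CHANNEL(QM1.TO.QM2) CHLTYPE(SDR) DISCINT(6000) HBINT(300)
```

The resulting text would be entered in runmqsc on the queue manager that owns the sender channel.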
In summary, channels are the essence of the WebSphere MQ product and the technical foundation for some of MQ's most important functions. Whether they are designed and managed well directly affects how fully MQ's functions and advantages can be realized. This article has listed only some of the relevant techniques and methods to share with readers. In WebSphere MQ configuration and system management, by flexibly configuring the operating system's TCP/IP parameters and the relevant channel properties, we can design WebSphere MQ message channels and control their operation so that MQ's powerful features are put to full use.