Java Theory and Practice: State Replication in the Web Tier


Most web applications of any significance need to maintain some session state, such as the contents of a user's shopping cart. How that state is managed and replicated in a clustered server application has a significant impact on the application's scalability. Many J2SE and J2EE applications store state in the HttpSession provided by the Servlet API. This month, columnist Brian Goetz examines some of the options for state replication and how to use HttpSession most effectively to get good scalability and performance.

Whether they are J2EE or J2SE server applications, most server applications use Java servlets in some way, perhaps directly through a presentation layer such as JSP technology, Velocity, or WebMacro, or perhaps through a servlet-based web services implementation such as Axis or Glue. One of the most important features provided by the Servlet API is session management: the authentication, expiration, and maintenance of per-user state through the HttpSession interface.

Session state

Almost every web application has some session state, which might be as simple as remembering whether you are logged in, or might be a more detailed history of your session, such as the contents of your shopping cart, cached results of previous queries, or the complete response history for a 20-page dynamic questionnaire. Because the HTTP protocol is itself stateless, session state needs to be stored somewhere and associated with the browsing session so that it can be easily retrieved the next time a page from the same web application is requested. Fortunately, J2EE provides several ways to manage session state: it can be stored in the data tier, in the web tier using the HttpSession interface of the Servlet API, in the Enterprise JavaBeans (EJB) tier, or even in the client tier using cookies or hidden form fields. Unfortunately, careless management of session state can cause serious performance problems.

If your application can store user state in the HttpSession, that is usually better than the alternatives. Storing session state on the client in HTTP cookies or hidden form fields carries significant security risks, because it exposes part of your application's internals to untrusted clients. (One early e-commerce site stored the shopping cart contents, including prices, in hidden form fields, which made for an easy exploit: any user who knew HTML and HTTP could buy any item for $0.01. Oops.) In addition, using cookies or hidden form fields is messy, error-prone, and fragile; if users disable cookies in their browsers, cookie-based approaches do not work at all.

The other approaches to storing server-side state in J2EE applications are to use stateful session beans or to store session state in the database. While stateful session beans offer greater flexibility in session state management, there are still advantages to storing session state in the web tier where possible. If your business objects are stateless, you can usually scale the application simply by adding more web servers, rather than adding both more web servers and more EJB containers, which is generally less expensive and easier to do. Another advantage of storing session state in HttpSession is that the Servlet API provides an easy way to be notified when a session expires. Finally, the cost of storing session state in the database can be prohibitive.

The Servlet specification does not require a servlet container to perform any kind of session replication or persistence, but it does suggest that state replication is an important part of the raison d'être of servlets, and it places some requirements on containers that do perform session replication. Session replication can offer many benefits: load balancing, scalability, fault tolerance, and high availability. Accordingly, most servlet containers support some form of HttpSession replication, but the mechanism, configuration, and timing of replication are implementation-dependent.

HttpSession API

Briefly, the HttpSession interface supports several methods that servlets, JSP pages, or other presentation-layer components can use to maintain session information across multiple HTTP requests. Sessions are bound to a specific user but are shared across all servlets in a web application; they are not specific to a single servlet. A useful way to think about a session is as a Map that stores objects for the duration of the session: you store session attributes by name with setAttribute() and retrieve them with getAttribute(). The HttpSession interface also contains session lifecycle methods, such as invalidate(), which tells the container to discard the session. Listing 1 shows the most commonly used elements of the HttpSession interface:

Listing 1. HttpSession API

public interface HttpSession {
    Object getAttribute(String s);
    Enumeration getAttributeNames();
    void setAttribute(String s, Object o);
    void removeAttribute(String s);
    boolean isNew();
    void invalidate();
    void setMaxInactiveInterval(int i);
    int getMaxInactiveInterval();
    ...
}

In theory, session state could be replicated fully and coherently across the cluster, so that any node could service any request and a simple load balancer could dispatch requests round-robin while avoiding failed hosts. However, this kind of tight replication has a high performance cost, is difficult to implement, and can run into scalability problems as the cluster approaches a certain size.

A more common approach is to combine load balancing with session affinity: the load balancer associates sessions with connections and sends subsequent requests in a session to the same server. Many hardware and software load balancers support this feature, which means that the replicated session information is accessed only when the primary host fails and the session has to fail over to another server.
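
The routing behavior described above can be sketched roughly as follows. This is only an illustration under assumptions: the class AffinityRouter, its methods, and the round-robin choice for new sessions are hypothetical and do not reflect the API of any particular load balancer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sticky-session router; not the API of any real load balancer.
class AffinityRouter {
    private final List<String> servers;
    private final Set<String> down = new HashSet<>();
    private final Map<String, String> affinity = new HashMap<>();
    private int next = 0; // round-robin cursor used when a session needs a server

    AffinityRouter(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    // Records that a server has failed and must not receive requests.
    void markDown(String server) {
        down.add(server);
    }

    // Returns the server that should handle a request for this session.
    String route(String sessionId) {
        String current = affinity.get(sessionId);
        if (current != null && !down.contains(current)) {
            return current; // normal case: stay on the same server
        }
        // New session, or the primary failed: pick the next healthy server.
        for (int i = 0; i < servers.size(); i++) {
            String candidate = servers.get(next);
            next = (next + 1) % servers.size();
            if (!down.contains(candidate)) {
                affinity.put(sessionId, candidate);
                return candidate;
            }
        }
        throw new IllegalStateException("no healthy servers");
    }
}
```

In this sketch, the replicated copy of a session matters only on the failover path, when route() has to move a session away from a server that was marked down.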

Replication approaches

Replication offers several potential benefits, including availability, fault tolerance, and scalability. There are also many approaches to session replication; which one to choose depends on the size of the application cluster, the goals of replication, and the replication facilities supported by the servlet container. Replication has performance costs, including CPU cycles (to serialize the objects stored in the session), network bandwidth (to broadcast updates), and, in disk-based schemes, the cost of writing to disk or a database.

Almost all servlet containers perform HttpSession replication by serializing the objects stored in the HttpSession, so if you are creating a distributable application, you should make sure to place only serializable objects in the session. (Some containers have special handling for EJB references, transaction contexts, and other non-serializable J2EE object types.)
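
One way to catch non-serializable session contents early, for example in a unit test, is to try serializing a candidate value before it is ever placed in a session. The helper below is a minimal sketch; the class and method names (SessionChecks, isReplicable) are hypothetical, and it uses only standard Java serialization, so it does not model any container-specific handling of J2EE object types.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Hypothetical test helper; not part of the Servlet API.
class SessionChecks {
    // Returns true if the value, and everything reachable from it,
    // can be written with standard Java serialization.
    static boolean isReplicable(Object value) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }
}
```

Because the check walks the whole object graph, a serializable collection that holds a non-serializable element is reported as not replicable.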

JDBC-based replication

One approach to session replication is simply to serialize the session contents and write them to a database. This approach is quite straightforward, and it has the advantage that not only can a session fail over to another host, but session data can survive even the failure of the entire cluster. The disadvantage of database-based replication is its performance cost: database transactions are expensive. While it can scale well in the web tier, it can cause scalability problems in the data tier; if the cluster grows large enough, scaling the data tier to accommodate session data may be difficult or prohibitively expensive.

File-based replication

File-based replication is similar to using a database to store serialized sessions, except that a shared file server stores the session data instead of a database. This approach generally costs less than using a database (in hardware, software licenses, and computational overhead), but at the expense of reliability: databases offer stronger durability guarantees than file systems.
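
To make the file-based idea concrete, here is a rough sketch of saving and restoring a session's attributes as a serialized file. Everything in it is an assumption made for illustration: the class FileSessionStore, the one-file-per-session layout, and the ".ser" suffix are hypothetical, and real containers use their own formats and locking.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical file-backed session store; real containers use their own formats.
class FileSessionStore {
    private final Path dir; // assumed to be a directory shared by all nodes

    FileSessionStore(Path dir) {
        this.dir = dir;
    }

    // Serializes a copy of the attribute map to <dir>/<sessionId>.ser.
    void save(String sessionId, Map<String, Serializable> attrs) {
        Path file = dir.resolve(sessionId + ".ser");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(new HashMap<>(attrs));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the attribute map back, for example on a node taking over after a failure.
    @SuppressWarnings("unchecked")
    Map<String, Serializable> load(String sessionId) {
        Path file = dir.resolve(sessionId + ".ser");
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Map<String, Serializable>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A database-based variant would write the same serialized bytes to a table keyed by session ID instead of to a file; the serialization cost is the same, and the difference lies in the durability guarantees and the cost of the write.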

Memory-based replication

Another approach to replication is to share copies of the serialized session data with one or more other servers in the cluster. Replicating all sessions to all hosts provides the greatest availability and the easiest load balancing, but it eventually limits the size of the cluster, because of the memory each node consumes and the network bandwidth consumed by replication messages. Some application servers support memory-based replication to "buddy" nodes, where each session exists on a primary server and one (or more) backup servers. This scheme scales better than replicating all sessions to all servers, but it complicates the load balancer's job when a session must fail over to another server, because the load balancer has to determine which other server (or servers) holds that session.
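
The buddy scheme can be illustrated with a small placement sketch. The policy shown here (primary chosen by hashing the session ID, backup being the next node in the list) is purely an assumption for illustration, as are the names BuddyPlacement, primaryFor, backupFor, and target; actual application servers select and track buddies in their own ways.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical placement policy; real containers choose partners in their own ways.
class BuddyPlacement {
    private final List<String> nodes;

    BuddyPlacement(List<String> nodes) {
        if (nodes.size() < 2) {
            throw new IllegalArgumentException("need at least two nodes");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    private int primaryIndex(String sessionId) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return Math.floorMod(sessionId.hashCode(), nodes.size());
    }

    // Primary node for a session, chosen by hashing the session ID.
    String primaryFor(String sessionId) {
        return nodes.get(primaryIndex(sessionId));
    }

    // Backup node: the node after the primary in the list, wrapping around.
    String backupFor(String sessionId) {
        return nodes.get((primaryIndex(sessionId) + 1) % nodes.size());
    }

    // Where a request should go, given whether the primary is reachable.
    String target(String sessionId, boolean primaryUp) {
        return primaryUp ? primaryFor(sessionId) : backupFor(sessionId);
    }
}
```

With a deterministic rule like this, the load balancer can compute where the backup lives instead of looking it up, which is one way to limit the extra complexity the text mentions.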

Timing considerations

In addition to deciding how to store replicated session data, there is the question of when to replicate it. The most reliable, but most expensive, approach is to replicate data every time it changes (for example, at the end of each servlet invocation). A less expensive approach, which risks losing some data in a failure, is to replicate data no more often than every N seconds.

Related to the question of timing is whether to replicate the entire session or to attempt to replicate only the attributes that have changed, which can be considerably less data. Both choices trade off reliability against performance. Servlet developers should recognize that session state can become "stale" in a failover (a replica from several requests ago), and should be prepared to handle session contents that are not up to date. (For example, if a session attribute is produced in step 3 of an interaction and the user is in step 4 when a failover sends the request to a system whose replica of the session is two requests old, then the servlet code for step 4 should be prepared not to find this attribute in the session and to take appropriate action, such as redirecting, rather than assuming it is there and throwing a NullPointerException when it is not found.)
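
The defensive style described in the example above might look something like the following sketch. A plain Map stands in for the HttpSession attributes so that the code is self-contained, and the attribute name "shippingAddress", the class Step4Handler, and the returned view strings are all hypothetical.

```java
import java.util.Map;

// Hypothetical wizard step; a Map stands in for the HttpSession attributes.
class Step4Handler {
    // Returns the view to show next. "shippingAddress" is an illustrative
    // attribute name assumed to have been stored during step 3.
    static String handle(Map<String, Object> session) {
        Object address = session.get("shippingAddress");
        if (address == null) {
            // The attribute may be missing after a failover to a stale replica:
            // send the user back instead of failing with a NullPointerException.
            return "redirect:/step3";
        }
        return "confirm:" + address;
    }
}
```

In a real servlet the same check would be made on the value returned by getAttribute() before it is used.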

Container support

Servlet containers vary in the HttpSession replication options they support and in how those options are configured. IBM WebSphere® offers the widest range of replication options: in-memory or database-based replication, end-of-servlet or time-based replication timing, and propagation of either the entire session snapshot or only the changed attributes. Its memory-based replication is based on JMS publish-subscribe and can replicate to all clones, to a single "buddy" replica, or to a dedicated replication server.

WebLogic also offers a range of options, including in-memory replication (using a single buddy replica), file-based replication, and database-based replication. JBoss, when used with the Tomcat or Jetty servlet container, offers memory-based replication with a choice of end-of-servlet or time-based replication timing and, in JBoss 3.2 and later, an option to replicate only the changed attributes instead of a full snapshot. Tomcat 5.0 offers memory-based replication to all cluster nodes. In addition, through projects like WADI, session replication can be added to servlet containers such as Tomcat or Jetty using the servlet filter mechanism.

Improving the performance of distributed web applications

Whichever session replication mechanism you choose, there are several ways to improve the performance and scalability of your web application. First, remember that to get the benefits of session replication, you need to mark your web application as distributable in the deployment descriptor and make sure that everything in the session is serializable.

Keep the session small

Because the cost of replicating a session grows with the size of the object graph stored in it, you should put as little data in the session as you can. Doing so reduces the serialization overhead, network bandwidth, and disk requirements of replication. In particular, storing shared objects in the session is generally a bad idea, because they will be replicated into every session they belong to.

Don't bypass setAttribute

When changing session attributes, be aware that even if your servlet container tries to do minimal updates (propagating only the changed attributes), the container may not notice that an attribute has changed if you do not call setAttribute(). (Imagine you have a Vector in the session representing the items in a shopping cart: if you call getAttribute() to retrieve the Vector and then add something to it without calling setAttribute() again, the container may not realize that the Vector has changed.)
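
The following sketch shows why the extra setAttribute() call matters. TrackingSession is a hypothetical stand-in that mimics a container doing minimal updates by remembering which attribute names were passed to setAttribute(); real containers may detect changes differently, so treat this as a model of the problem, not of any specific product.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a container session that replicates only the
// attributes passed to setAttribute since the last replication.
class TrackingSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty.add(name); // the only place this "container" learns of a change
    }

    // Returns the names that would be replicated now, and resets the tracking.
    Set<String> flushDirty() {
        Set<String> result = new HashSet<>(dirty);
        dirty.clear();
        return result;
    }
}

class CartExample {
    // Adds an item and then re-sets the attribute so the change is noticed.
    @SuppressWarnings("unchecked")
    static void addItem(TrackingSession session, String item) {
        List<String> cart = (List<String>) session.getAttribute("cart");
        if (cart == null) {
            cart = new ArrayList<>();
        }
        cart.add(item);
        session.setAttribute("cart", cart);
    }
}
```

If a caller mutates the list returned by getAttribute("cart") and skips the final setAttribute() call, flushDirty() in this model returns an empty set and the new item would not be replicated.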

Use fine-grained session attributes

For containers that support minimal updates, you can reduce the cost of session replication by putting multiple fine-grained objects in the session instead of one big monolithic object. That way, changes to rapidly changing data do not force the container to serialize and propagate slowly changing data as well.
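
A rough way to see the effect is to compare how many bytes would be serialized when one small value changes. The cost model below is a sketch under assumptions: ReplicationCost and its methods are hypothetical, and it simply sums the Java-serialized size of each changed attribute, which ignores any framing or compression a real container might add.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Set;

// Hypothetical cost model: bytes serialized when only changed attributes are sent.
class ReplicationCost {
    // Size in bytes of the value under standard Java serialization.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Sums the serialized size of every attribute whose name is in 'changed'.
    static int bytesToReplicate(Map<String, Serializable> attrs, Set<String> changed) {
        int total = 0;
        for (String name : changed) {
            total += serializedSize(attrs.get(name));
        }
        return total;
    }
}
```

If a large, rarely changing history and a frequently changing counter are stored as two attributes, a counter update costs only the counter's bytes; if both live inside one attribute, every counter update drags the whole history along with it.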

Invalidate when you are done

If you know that the user is finished with a session (for example, the user has chosen to log out), make sure to call HttpSession.invalidate(). Otherwise, the session persists until it expires, consuming memory, possibly for a long time (depending on the session timeout). Many servlet containers limit the amount of memory they will use across all sessions; when this limit is reached, the least recently used sessions are serialized and written to disk. If you know the user is finished with the session, save the container the work and invalidate it.

Keep the session clean

If you have large items in the session that are needed for only part of the session, remove them when they are no longer needed. Removing them reduces the cost of session replication. (This practice is similar to explicit nulling to help the garbage collector, which longtime readers know I generally do not recommend, but in this case the cost of keeping garbage in the session is much higher because of replication, so it is worth helping the container in this way.)
