What is the Java Memory Model, and how was it broken?
Level: Advanced
Brian Goetz
(brian@quiotix.com), Chief Consultant, Quiotix Corp., March 2004
JSR 133, which has been active for nearly three years, has recently issued its public recommendation on how to fix the Java Memory Model (JMM). The original JMM had several serious flaws, which resulted in surprisingly difficult semantics for concepts that were supposed to be simple, such as volatile, final, and synchronized. In this installment of Java theory and practice, Brian Goetz shows how the semantics of volatile and final will be strengthened to fix the JMM. Some of these changes have already been integrated into JDK 1.4; others will be included in JDK 1.5.
The Java platform integrates threading and multiprocessing into the language to a much greater degree than most earlier programming languages. The attempt was ambitious and pioneering, so it is perhaps not surprising that the problem turned out to be somewhat harder than the Java architects originally thought. Much of the underlying confusion about synchronization and thread safety comes from the non-intuitive subtleties of the Java Memory Model (JMM), which was originally specified in Chapter 17 of the Java Language Specification and has been re-specified by JSR 133.
For example, not all multiprocessor systems exhibit cache coherency: if one processor holds an updated value of a variable in its cache that has not yet been flushed to main memory, other processors may not see that update. In the absence of cache coherency, two different processors can see two different values for the same location in memory. This may sound unlikely, but it is deliberate. It is a means of obtaining higher performance and scalability, at the cost of placing a burden on developers and compilers to write code that accommodates these issues.
What is a memory model, and why do I need one?
A memory model describes the relationship between the variables in a program (instance fields, static fields, and array elements) and the low-level details of storing them to and retrieving them from memory in a real computer system. Objects are ultimately stored in memory, but the compiler, runtime, processor, or cache may take some liberties with the timing of moving values to or from a variable's assigned memory location. For example, a compiler may choose to optimize a loop index variable by keeping it in a register, or the cache may delay flushing a new value of a variable to main memory until a more opportune time. All of these optimizations aid performance and are generally transparent to the user, but on multiprocessor systems these complexities can sometimes become fully visible.
The JMM allows the compiler and cache to take significant liberties with the order in which data is moved between a processor-specific cache (or register) and main memory, unless the programmer has explicitly asked for certain visibility guarantees using synchronized or volatile. This means that in the absence of synchronization, memory operations can appear to happen in different orders from the perspective of different threads.
By contrast, languages like C and C++ do not have explicit memory models. A C program instead inherits the memory model of the processor that executes it (although the compiler for a given architecture probably knows something about the memory model of the underlying processor, and some of the responsibility for maintaining consistency falls to the compiler). This means that a concurrent C program may run correctly on one processor architecture but not on another. While the JMM may be confusing at first, it brings a significant benefit: a program that is correctly synchronized according to the JMM should run correctly on any Java-enabled platform.
Shortcomings of the original JMM
While the JMM specified in Chapter 17 of the Java Language Specification was an ambitious attempt to define a consistent, cross-platform memory model, it had some subtle but significant failings. The semantics of synchronized and volatile were confusing enough that many developers sometimes chose to ignore the rules, because writing correctly synchronized code under the old memory model was so difficult.
The old JMM allowed some surprising and confusing things to happen, such as final fields appearing not to have the value that was set in the constructor (which made supposedly immutable objects not immutable) and unexpected results from memory operation reordering. It also prevented some otherwise effective compiler optimizations. If you have read any of the articles on the double-checked locking problem (see Resources), you will recall how confusing memory operation reordering can be, and how subtle but serious problems can creep into your code when you do not synchronize properly (or actively try to avoid synchronizing). Worse, many incorrectly synchronized programs appear to work correctly in some situations, such as under light load, on uniprocessor systems, or on processors with stronger memory models than the JMM requires.
The term "reordering" is used to describe several classes of actual and apparent reorderings of memory operations:
The compiler is free to reorder certain instructions as an optimization when doing so does not change the semantics of the program.
The processor is allowed to execute operations out of order under some circumstances.
The cache is generally allowed to write variables back to main memory in a different order than the program wrote them.
From the perspective of another thread, any of these conditions can cause operations to appear to occur in a different order than the program specifies. Regardless of the source of the reordering, the memory model treats all of these conditions as equivalent.
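To make the idea of apparent reordering concrete, here is a minimal sketch of a classic two-thread experiment. The class name ReorderingDemo and its fields are hypothetical and exist only for illustration. Each thread writes one plain field and then reads the other. Because nothing synchronizes the two threads, the memory model permits an outcome in which both reads return 0, which would be impossible if each thread's operations were seen by the other in program order.

```java
// Illustrative sketch only: names are hypothetical, not from the article.
class ReorderingDemo {
    static int x, y;     // shared and deliberately unsynchronized
    static int r1, r2;   // what each thread observed

    // Runs one trial and returns the observed pair {r1, r2}.
    static int[] runOnce() {
        x = 0; y = 0; r1 = -1; r2 = -1;
        Thread a = new Thread(() -> { x = 1; r1 = y; });
        Thread b = new Thread(() -> { y = 1; r2 = x; });
        a.start();
        b.start();
        join(a);   // join() makes the threads' writes visible to the caller
        join(b);
        return new int[] { r1, r2 };
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int bothZero = 0;
        for (int i = 0; i < 10_000; i++) {
            int[] r = runOnce();
            if (r[0] == 0 && r[1] == 0) {
                bothZero++;   // allowed by the memory model, though it may be rare
            }
        }
        System.out.println("trials where both reads saw 0: " + bothZero);
    }
}
```

Whether the surprising outcome actually shows up depends on the JVM, the hardware, and timing, so a run that reports zero occurrences does not show that the program is correctly synchronized.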
Goals of JSR 133
JSR 133, which was chartered to fix the JMM, has several goals:
Preserve existing safety guarantees, including type safety.
Provide out-of-thin-air safety. This means that variable values are not created "out of thin air": for a thread to observe a variable holding the value X, some thread must actually have written the value X to that variable in the past.
The semantics of "correctly synchronized" programs should be as simple and intuitive as feasible. Thus, "correctly synchronized" should be defined both formally and intuitively, and the two definitions should be consistent with each other.
Programmers should be able to create multithreaded programs with confidence. Of course, there is no magic that makes writing concurrent programs easy, but the goal is to relieve programmers of the burden of understanding all the subtleties of the memory model.
High-performance JVM implementations should be possible across a wide range of popular hardware architectures. Modern processors differ substantially in their memory models; the JMM should accommodate as many architectures as practical without sacrificing performance.
Provide a synchronization idiom that allows an object to be published and made visible without synchronization. This is a new safety guarantee called initialization safety.
There should be minimal impact on existing code.
It is worth noting that broken techniques (such as double-checked locking) are still broken under the new memory model, and that "fixing" double-checked locking was not one of the goals of the new memory model effort. (However, the new semantics of volatile allow one of the commonly proposed alternatives to double-checked locking to work correctly, although the technique is still discouraged.)
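As an illustration of the parenthetical remark about volatile, the following is a minimal sketch of the volatile-based variant of double-checked locking. It assumes the revised JSR 133 semantics (Java 5 and later); under the original JMM this code would not be safe. The class LazyResource and its field are hypothetical names chosen for the example.

```java
// Illustrative sketch only: assumes the revised (JSR 133) volatile semantics.
class LazyResource {
    // volatile is essential here: the write to 'instance' may not be
    // reordered ahead of the writes performed while constructing the object.
    private static volatile LazyResource instance;

    private String name;   // deliberately not final, so safety rests on volatile

    private LazyResource(String name) {
        this.name = name;
    }

    static LazyResource getInstance() {
        LazyResource local = instance;        // first check, without locking
        if (local == null) {
            synchronized (LazyResource.class) {
                local = instance;             // second check, under the lock
                if (local == null) {
                    local = new LazyResource("resource");
                    instance = local;         // volatile write publishes the object
                }
            }
        }
        return local;
    }

    String name() {
        return name;
    }
}
```

Without the volatile modifier on the instance field, this remains the broken idiom described above. A static holder class or plain synchronization is usually simpler and easier to get right.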
In the three years since the JSR 133 process began, it has become clear that these issues are far more subtle than anyone had expected. Such is the price of being a pioneer! The final formal semantics are more complicated than originally anticipated, and in fact took quite a different form than initially envisioned, but the informal semantics are clear and intuitive and will be outlined in Part 2 of this article.
Synchronization and visibility
Most programmers know that the synchronized keyword enforces a mutex (mutual exclusion) that prevents more than one thread at a time from entering a synchronized block protected by a given monitor. But synchronization has another aspect: as specified by the JMM, it also enforces certain memory visibility rules. It ensures that caches are flushed when exiting a synchronized block and invalidated when entering one, so that a value written by one thread during a synchronized block protected by a given monitor is visible to any other thread that executes a synchronized block protected by the same monitor. It also ensures that the compiler does not move instructions from inside a synchronized block to outside it (although in some cases it may move instructions from outside a synchronized block to inside it). The JMM makes no such guarantee in the absence of synchronization, which is why synchronization (or its sibling, volatile) must be used whenever multiple threads access the same variables.
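The two roles of synchronized described above can be seen in a small sketch. The class SharedCounter and the helper runWorkers are hypothetical names for this example. Every access to the counter goes through a method synchronized on the same monitor, so increments do not interleave (mutual exclusion) and each thread sees the latest value written by the others (visibility).

```java
// Illustrative sketch only: names are hypothetical, not from the article.
class SharedCounter {
    private int value;   // guarded by 'this'

    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }

    // Starts 'threads' workers that each call increment() 'perThread' times,
    // waits for all of them, and returns the final count.
    static int runWorkers(int threads, int perThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10_000));
    }
}
```

If the synchronized modifiers were removed, the final count could come out lower than threads * perThread, because unsynchronized increments can overwrite one another and need not become visible promptly.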
Problem #1: Immutable objects weren't
One of the most surprising failings of the JMM is that immutable objects, whose immutability was intended to be guaranteed through the use of the final keyword, could appear to change their value. (Public service reminder: making all fields of an object final does not necessarily make the object immutable. All fields must also be primitive types or references to immutable objects.) Immutable objects, such as String, are supposed to require no synchronization. However, because of potential delays in propagating memory writes from one thread to another, there is a possible race condition that would allow a thread to first see one value for an immutable object and then, at some later time, see a different value.
How can this happen? Consider the implementation of String in the Sun 1.4 JDK, which has basically three important final fields: a reference to a character array, a length, and an offset into the character array that marks the start of the string being represented. String was implemented this way, rather than with only a character array, so that character arrays could be shared among multiple String and StringBuffer objects without copying the text into a new array every time a String is created. For example, String.substring() creates a new string that shares the same character array with the original String and differs only in the length and offset fields. Suppose you execute the following code:
String s1 = "/usr/tmp";
String s2 = s1.substring(4);   // contains "/tmp"
The string s2 will have an offset of 4 and a length of 4, but it will share the same character array, the one containing "/usr/tmp", with s1. Before the String constructor runs, the constructor for Object initializes all fields with their default values, including the final length and offset fields. When the String constructor runs, the length and offset are then set to their desired values. But under the old memory model, in the absence of synchronization, it is possible for another thread to temporarily see the offset field as having its default value of 0 and then later see the correct value of 4. The effect is that the value of s2 appears to change from "/usr" to "/tmp". This is not what was intended, and it may not be possible on all JVMs or platforms, but the old memory model specification allowed it.
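The field layout described above can be sketched with a small stand-in class. SharedString is a hypothetical, simplified model written for this example, not the actual JDK source. It shows how a substring shares the backing array and differs only in offset and length, and why a reader that saw the default offset of 0 together with a length of 4 would observe "/usr" instead of "/tmp".

```java
// Illustrative sketch only: a simplified model, not the real java.lang.String.
final class SharedString {
    private final char[] value;   // backing array, possibly shared
    private final int offset;     // index of the first character
    private final int count;      // number of characters

    SharedString(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Returns a new string that shares this string's character array.
    SharedString substring(int beginIndex) {
        if (beginIndex < 0 || beginIndex > count) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        return new SharedString(value, offset + beginIndex, count - beginIndex);
    }

    boolean sharesArrayWith(SharedString other) {
        return value == other.value;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharedString s1 = new SharedString("/usr/tmp");
        SharedString s2 = s1.substring(4);
        System.out.println(s2 + " shares array: " + s2.sharesArrayWith(s1));
    }
}
```

With the array holding "/usr/tmp", an offset of 4 and a count of 4 select "/tmp", while an offset of 0 and a count of 4 select "/usr". That difference is exactly the apparent change in value that the old model permitted another thread to see.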
Problem #2: Reordering volatile and nonvolatile stores
The other major area where the existing JMM caused some very confusing results was memory operation reordering on volatile fields. The existing JMM says that volatile reads and writes go directly to main memory, which prohibits caching values in registers and bypasses processor-specific caches. This allows multiple threads to always see the most up-to-date value for a given variable. However, it turns out that this definition of volatile was not as useful as first intended, and it resulted in significant confusion about the actual meaning of volatile.
To provide good performance in the absence of synchronization, the compiler, runtime, and cache are generally allowed to reorder ordinary memory operations as long as the currently executing thread cannot tell the difference. (This is referred to as within-thread as-if-serial semantics.) Volatile reads and writes, however, are totally ordered across threads; the compiler or cache cannot reorder volatile reads and writes with each other. Unfortunately, the JMM did allow volatile reads and writes to be reordered with respect to reads and writes of ordinary variables, which means that a volatile flag cannot be used as an indication that a set of operations has completed. Consider the following code, where the intention is that the volatile field initialized indicates that initialization has been completed.
Listing 1. Using a volatile field as a "guard" variable
Map configOptions;
char[] configText;
volatile boolean initialized = false;
...
// In Thread A
configOptions = new HashMap();
configText = readConfigFile(fileName);
processConfigOptions(configText, configOptions);
initialized = true;
...
// In Thread B
while (!initialized)
  sleep();
// use configOptions
The idea here is that the volatile variable initialized acts as a guard to indicate that a set of other operations has completed. It is a good idea, but it did not work under the old JMM, because the old JMM allowed nonvolatile writes (such as the write to the configOptions field, as well as the writes to the fields of the Map referenced by configOptions) to be reordered with volatile writes. Another thread might therefore see initialized as true but not yet have a consistent or current view of the configOptions field or the objects it references. The old semantics of volatile made promises only about the visibility of the variable being read or written, and no promises about other variables. While this approach was easier to implement efficiently, it turned out to be less useful than initially thought.
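For comparison, here is a minimal runnable sketch of the same guard pattern, assuming the strengthened volatile semantics of JSR 133 (Java 5 and later), under which a write to a volatile field happens-before a subsequent read of that field. The class ConfigLoader, the "mode" option, and the demo method are hypothetical stand-ins for the article's readConfigFile and processConfigOptions, which are not shown in the source.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: assumes the revised (JSR 133) volatile semantics.
class ConfigLoader {
    private Map<String, String> configOptions;       // plain, nonvolatile field
    private volatile boolean initialized = false;    // guard variable

    // Thread A: hypothetical stand-in for reading and processing a config file.
    void initialize() {
        Map<String, String> options = new HashMap<>();
        options.put("mode", "test");
        configOptions = options;
        initialized = true;   // volatile write: publishes the writes above
    }

    // Thread B: waits for the guard, then uses the configuration.
    String awaitOption(String key) {
        while (!initialized) {    // volatile read
            Thread.yield();       // the article's listing sleeps here instead
        }
        return configOptions.get(key);
    }

    // Runs a reader and a writer thread and returns what the reader saw.
    static String demo() {
        ConfigLoader loader = new ConfigLoader();
        String[] seen = new String[1];
        Thread reader = new Thread(() -> seen[0] = loader.awaitOption("mode"));
        Thread writer = new Thread(loader::initialize);
        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Under the old JMM described in this article, nothing would prevent the reader from seeing initialized as true while still seeing a stale or partially populated map. The sketch is only guaranteed to behave as intended under the revised model.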
Conclusion
The JMM, as specified in Chapter 17 of the Java Language Specification, has some serious flaws that allow unintuitive and undesirable things to happen to reasonable-looking programs. If it is too difficult to write concurrent classes correctly, then it is all but certain that many concurrent classes will not work as expected, and that is a flaw in the platform. Fortunately, it was possible to create a memory model that is more consistent with most developers' intuition without breaking any code that was correctly synchronized under the old memory model, and the JSR 133 process has done just that. Next month, we will look at the details of the new memory model (much of which is already built into the 1.4 JDK).
Resources
Read the complete Java theory and practice series by Brian Goetz.
Bill Pugh, who first discovered many of the problems with the Java Memory Model, maintains a Java Memory Model page.
The problems with the old memory model and a summary of the new memory model semantics can be found in the JSR 133 FAQ.
Read more about the double-checked locking problem and why the obvious attempts to fix it do not work.
Read more about why you do not want to let a reference to an object escape during construction in Java theory and practice: Safe construction techniques.
JSR 133, which is charged with revising the JMM, was convened under the Java Community Process. JSR 133 has recently released its public review specification.
If you want to see how the specification was developed, browse the JMM mailing list archive.
Concurrent Programming in Java by Doug Lea is an expert treatment of the subtle issues involved in writing multithreaded programs in Java.
Synchronization and the Java Memory Model summarizes the actual meaning of synchronization.
Chapter 17 of The Java Language Specification covers the gory details of the original Java Memory Model.
Find hundreds more Java technology resources in the developerWorks Java technology zone.
Browse the Developer Bookstore for a comprehensive listing of technical books, including hundreds of Java-related titles.
About the author
Brian Goetz is a software consultant and has been a professional software developer for the past 15 years. He is the chief consultant at Quiotix, a software development and consulting firm located in Los Altos, California, and he serves on several JCP Expert Groups. See Brian's published and upcoming articles in popular industry publications.