original:
http://www.blogcn.com/user8/flier_lu/blog/6018564.html
In languages such as Java and C#, which handle strings with reference semantics, a string is an immutable object, so strings with identical content can be reused through some mechanism. In such languages, two references to equal strings at different memory locations behave no differently from two references to one and the same string object. Especially for applications that handle a large number of repeated strings, such as XML parsing, this optimization can noticeably reduce a program's memory footprint. The SAX parser standard, for example, specifically defines the
http://xml.org/sax/features/string-interning feature for string reuse.
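As a small illustration of the SAX feature mentioned above, the following sketch queries the feature on whatever parser the JDK provides. The class name SaxInterningCheck and the "unknown" fallback are assumptions made for this example; whether a given parser recognizes the feature, and what value it reports, depends on the parser.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of SAX or of the original article.
class SaxInterningCheck {
    static final String FEATURE = "http://xml.org/sax/features/string-interning";

    /** Returns "true", "false", or "unknown" if the parser does not expose the feature. */
    static String interningState() {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            return String.valueOf(reader.getFeature(FEATURE));
        } catch (SAXException | ParserConfigurationException e) {
            // SAXNotRecognizedException and SAXNotSupportedException are SAXExceptions
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(FEATURE + " = " + interningState());
    }
}
```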
At the language level, both Java and C# support this through String.intern (String.Intern in C#). For the CLR side, see my other article,
"Optimization of string uncapped in CLR"
The Java implementation is very similar. The String.intern method uses the content of the current string as the key and the object reference as the value, and puts the pair into a global hash table.
Java code:
//
// java/lang/String.java
//
public final class String
{
  // ...
  public native String intern(); // implemented as a JNI function for efficiency
}

//
// hotspot/src/share/vm/prims/jvm.cpp
//
JVM_ENTRY(jstring, JVM_InternString(JNIEnv *env, jstring str))
  JVMWrapper("JVM_InternString");
  if (str == NULL) return NULL;
  oop string = JNIHandles::resolve_non_null(str);       // resolve the reference to an internal handle
  oop result = StringTable::intern(string, CHECK_0);    // perform the actual intern operation
  return (jstring) JNIHandles::make_local(env, result); // get a local reference for the internal handle
JVM_END

//
// hotspot/src/share/vm/memory/symbolTable.cpp
//
oop StringTable::intern(oop string, TRAPS)
{
  if (string == NULL) return NULL;
  ResourceMark rm(THREAD); // protect the thread resource area
  int length;
  Handle h_string (THREAD, string);
  jchar* chars = java_lang_String::as_unicode_string(string, length); // get the actual string content
  oop result = intern(h_string, chars, length, CHECK_0);              // complete the intern operation
  return result;
}

oop StringTable::intern(Handle string_or_null, jchar* name, int len, TRAPS)
{
  int hashValue = hash_string(name, len);           // first compute the hash value from the string content
  StringTableBucket* bucket = bucketFor(hashValue); // get the target bucket from the hash value
  oop string = bucket->lookup(name, len);           // then check whether the string already exists
  // Found
  if (string != NULL) return string;
  // Otherwise, add to symbol to table
  return basic_add(string_or_null, name, len, hashValue, CHECK_0); // put the string into the hash table
}
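The effect of this global table can be observed from plain Java. The sketch below builds two strings with the same content at run time, so they are separate objects, and then compares the references returned by intern. The class name InternDemo and the helper build are made up for this example.

```java
// Hypothetical demo class, not part of the JDK or of the original article.
class InternDemo {
    /** Builds a string at run time so that it is not a compile-time constant. */
    static String build(String a, String b) {
        return new StringBuilder(a).append(b).toString();
    }

    public static void main(String[] args) {
        String s1 = build("string", "Pool");
        String s2 = build("string", "Pool");
        System.out.println(s1 == s2);                   // two distinct objects
        System.out.println(s1.equals(s2));              // with the same content
        System.out.println(s1.intern() == s2.intern()); // intern maps both to one shared instance
    }
}
```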
There is no way to explicitly remove a string from the global string table by hand. A string can be removed only when the garbage collection thread, during a full collection, determines that it is no longer in use; the collector then calls the StringTable::unlink method to remove it.
Java code:
//
// hotspot/src/share/vm/memory/genMarkSweep.cpp
//
void GenMarkSweep::mark_sweep_phase1(...)
{
  // ...
  StringTable::unlink();
}

//
// hotspot/src/share/vm/memory/symbolTable.cpp
//
void StringTable::unlink()
{
  // Readers of the string table are unlocked, so we should only be
  // removing entries at a safepoint.
  assert(SafepointSynchronize::is_at_safepoint(), "must be at safepoint");
  for (StringTableBucket* bucket = firstBucket(); bucket <= lastBucket(); bucket++)
  {
    for (StringTableEntry** p = bucket->entry_addr(); *p != NULL;)
    {
      StringTableEntry* entry = *p;
      assert(entry->literal_string() != NULL, "just checking");
      if (entry->literal_string()->is_gc_marked())
      {
        // the string object is still reachable
        // Is this one of calls those necessary only for verification? (dld)
        entry->oops_do(&MarkSweep::follow_root_closure);
        p = entry->next_addr();
      }
      else
      {
        // if it is not reachable, return the entry to the free list
        *p = entry->next();
        entry->set_next(free_list);
        free_list = entry;
      }
    }
  }
}
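The sweep over one bucket chain can be sketched in Java as follows. This is only an illustration of the unlink logic shown above, not HotSpot code: the class SweepSketch, its Entry type, and the boolean marked field (standing in for the GC mark bit) are all assumptions made for the example.

```java
// Hypothetical sketch of unlinking unmarked entries from a single bucket chain.
class SweepSketch {
    /** One entry in a bucket chain; 'marked' stands in for the GC mark bit. */
    static final class Entry {
        final String literal;
        boolean marked;
        Entry next;

        Entry(String literal, boolean marked, Entry next) {
            this.literal = literal;
            this.marked = marked;
            this.next = next;
        }
    }

    Entry head;     // first entry of the bucket
    Entry freeList; // recycled entries

    /** Unlinks every unmarked entry and pushes it onto the free list. */
    void unlink() {
        Entry prev = null;
        Entry e = head;
        while (e != null) {
            Entry next = e.next;
            if (e.marked) {
                prev = e; // still reachable: keep it in the chain
            } else {
                if (prev == null) head = next; else prev.next = next;
                e.next = freeList; // recycle the entry
                freeList = e;
            }
            e = next;
        }
    }

    /** Counts the entries in a chain. */
    static int length(Entry e) {
        int n = 0;
        for (; e != null; e = e.next) n++;
        return n;
    }
}
```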
From the code above we can see directly that String.intern gives the JVM (Sun JDK 1.4.2) string sharing based on a global hash table. This implementation is simple and shares strings to the greatest possible extent. Its drawbacks are that the sharing granularity is too coarse, the optimization effect cannot be measured, and a large number of interned strings may degrade the performance of the global string table. For this reason, Eclipse gives up the JVM-level string sharing mechanism and alleviates these problems to some extent by providing a fine-grained, fully controllable, and measurable partitioned string sharing mechanism. The Eclipse core interface IStringPoolParticipant is implemented explicitly by the user, who submits the strings that need to be shared in its shareStrings method.
Java code:
//
// org.eclipse.core.runtime.IStringPoolParticipant
//
public interface IStringPoolParticipant
{
  /**
   * Instructs this participant to share its strings in the provided
   * pool.
   */
  public void shareStrings(StringPool pool);
}
For example, the MarkerInfo type implements the IStringPoolParticipant interface. In its shareStrings method it submits its own string, type, and asks its child element to submit its strings in turn.
Java code:
//
// org.eclipse.core.internal.resources.MarkerInfo
//
public class MarkerInfo implements ..., IStringPoolParticipant
{
  public void shareStrings(StringPool set)
  {
    type = set.add(type);
    Map map = attributes;
    if (map instanceof IStringPoolParticipant)
      ((IStringPoolParticipant) map).shareStrings(set);
  }
}
In this way, as long as the nodes of an object tree selectively implement the IStringPoolParticipant interface, all strings that need to be shared can be submitted to one string pool in a single pass. Workspace is such a root entry for string sharing: after opening the workspace, its open method registers the object that manages string sharing with the global list of string pool partitions to be optimized.
Java code:
//
// org.eclipse.core.internal.resources.Workspace
//
public class Workspace ...
{
  protected SaveManager saveManager;

  public IStatus open(IProgressMonitor monitor) throws CoreException
  {
    // open the workspace

    // finally register a new string pool partition
    InternalPlatform.getDefault().addStringPoolParticipant(saveManager, getRoot());
    return Status.OK_STATUS;
  }
}
A type that needs this optimization, such as SaveManager, only has to implement the IStringPoolParticipant interface and, when called, submit its own strings and those of its child elements. The child elements do not even need to implement the IStringPoolParticipant interface; they only have to pass the submission on level by level, for example:
Java code:
//
// org.eclipse.core.internal.resources.SaveManager
//
public class SaveManager implements ..., IStringPoolParticipant
{
  protected ElementTree lastSnap;

  public void shareStrings(StringPool pool)
  {
    lastSnap.shareStrings(pool);
  }
}

//
// org.eclipse.core.internal.watson.ElementTree
//
public class ElementTree
{
  protected DeltaDataTree tree;

  public void shareStrings(StringPool set)
  {
    tree.storeStrings(set);
  }
}

//
// org.eclipse.core.internal.dtree.DeltaDataTree
//
public class DeltaDataTree extends AbstractDataTree
{
  private AbstractDataTreeNode rootNode;
  private DeltaDataTree parent;

  public void storeStrings(StringPool set)
  {
    // copy field to protect against concurrent changes
    AbstractDataTreeNode root = rootNode;
    DeltaDataTree dad = parent;
    if (root != null)
      root.storeStrings(set);
    if (dad != null)
      dad.storeStrings(set);
  }
}

//
// org.eclipse.core.internal.dtree.AbstractDataTreeNode
//
public abstract class AbstractDataTreeNode
{
  protected AbstractDataTreeNode children[];
  protected String name;

  public void storeStrings(StringPool set)
  {
    name = set.add(name);
    // copy children pointer in case of concurrent modification
    AbstractDataTreeNode[] nodes = children;
    if (nodes != null)
      for (int i = nodes.length; --i >= 0;)
        nodes[i].storeStrings(set);
  }
}
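The level-by-level propagation can be tried out in isolation with a small self-contained sketch. The Pool and Node classes below are hypothetical stand-ins, not the Eclipse types; the sketch only shows that after one pass over the tree, nodes with equal names refer to one shared String instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the Eclipse pool and tree node types.
class TreeSharingSketch {
    /** Returns the first instance seen for each string content. */
    static final class Pool {
        private final Map<String, String> map = new HashMap<>();

        String add(String s) {
            if (s == null) return null;
            String existing = map.putIfAbsent(s, s);
            return existing != null ? existing : s;
        }
    }

    /** Tree node that shares its own name and forwards the pool to its children. */
    static final class Node {
        String name;
        Node[] children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = children;
        }

        void storeStrings(Pool pool) {
            name = pool.add(name);
            for (Node child : children) child.storeStrings(pool);
        }
    }

    /** Creates a fresh String object so equal names start out as separate instances. */
    static String fresh(String s) {
        return new String(s);
    }

    public static void main(String[] args) {
        Node root = new Node(fresh("src"), new Node(fresh("src")), new Node(fresh("bin")));
        root.storeStrings(new Pool());
        System.out.println(root.name == root.children[0].name); // both names now share one instance
    }
}
```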
All strings that need to be optimized are submitted to a single string pool through the StringPool.add method. This pool serves a slightly different purpose from the JVM-level string table: it is only a temporary collecting place used while a string pool partition is being optimized, and it is not the entry through which the strings are referenced afterwards. It is therefore implemented as a simple wrapper around HashMap, plus a rough estimate of the space saved as a measure of the optimization effect.
Java code:
//
// org.eclipse.core.runtime.StringPool
//
public final class StringPool
{
  private int savings;
  private final HashMap map = new HashMap();

  public StringPool()
  {
    super();
  }

  public String add(String string)
  {
    if (string == null)
      return string;
    Object result = map.get(string);
    if (result != null)
    {
      if (result != string)
        savings += 44 + 2 * string.length();
      return (String) result;
    }
    map.put(string, string);
    return string;
  }

  // get a rough estimate of how much space was saved
  public int getSavedStringCount()
  {
    return savings;
  }
}
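To see how the estimate in the listing accumulates, here is a minimal sketch that applies the same 44 + 2 * length formula and, next to it, a hypothetical identity-based counter that counts each duplicate instance only once. The class SavingsSketch and the fields estimated, exact, and counted are assumptions made for this example and are not part of the Eclipse StringPool.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical comparison of the listing's estimate with a once-per-instance count.
class SavingsSketch {
    private final Map<String, String> canonical = new HashMap<>();
    // Assumption for this sketch: remembers which duplicate instances were already counted.
    private final Map<String, Boolean> counted = new IdentityHashMap<>();

    int estimated; // same formula as the listing: added on every duplicate submission
    int exact;     // added once per distinct duplicate instance

    String add(String s) {
        if (s == null) return null;
        String first = canonical.putIfAbsent(s, s);
        if (first == null) return s; // first instance with this content
        if (first != s) {
            int bytes = 44 + 2 * s.length();
            estimated += bytes;
            if (counted.put(s, Boolean.TRUE) == null) exact += bytes;
        }
        return first;
    }

    public static void main(String[] args) {
        SavingsSketch pool = new SavingsSketch();
        String s1 = new String("abc");
        String s2 = new String("abc");
        pool.add(s1);
        pool.add(s2);
        pool.add(s2); // the same duplicate instance submitted a second time
        System.out.println(pool.estimated + " vs " + pool.exact);
    }
}
```

For a three-character string the formula yields 44 + 2 * 3 = 50 per counted submission, so the two submissions of s2 add 100 to the estimate while the identity-based counter stays at 50. The extra IdentityHashMap costs memory and time on every duplicate.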
However, the estimate here is inaccurate. Suppose the pool already contains the string S1, and a string S2 with the same content but at a different physical location is submitted. If S2 is submitted more than once, its saving is counted each time and the estimate is wrong. Of course, if an exact value is needed, the pool can be refactored to track the optimization of each string and produce accurate metrics, at the cost of some efficiency. Now that we understand how the strings to be optimized are submitted and what happens to them afterwards, let us look at how the Eclipse core ties the two together. The Workspace.open method calls the InternalPlatform.addStringPoolParticipant method to add a string pool partition to the global optimization task queue.
Java code:
//
// org.eclipse.core.internal.runtime.InternalPlatform
//
public final class InternalPlatform
{
  private StringPoolJob stringPoolJob;

  public void addStringPoolParticipant(IStringPoolParticipant participant, ISchedulingRule rule)
  {
    if (stringPoolJob == null)
      stringPoolJob = new StringPoolJob(); // singleton pattern
    stringPoolJob.addStringPoolParticipant(participant, rule);
  }
}

//
// org.eclipse.core.internal.runtime.StringPoolJob
//
public class StringPoolJob extends Job
{
  private static final long INITIAL_DELAY = 10000; // 10 seconds (the source comment says "five seconds")

  private Map participants = Collections.synchronizedMap(new HashMap(10));

  public void addStringPoolParticipant(IStringPoolParticipant participant, ISchedulingRule rule)
  {
    participants.put(participant, rule);
    if (sleep())
      wakeUp(INITIAL_DELAY);
  }

  public void removeStringPoolParticipant(IStringPoolParticipant participant)
  {
    participants.remove(participant);
  }
}
This job periodically performs the sharing optimization on every registered partition. The StringPoolJob type contains the code of the partition job, and its underlying implementation relies on the Eclipse job scheduling mechanism. Readers interested in Eclipse job scheduling can refer to the article On the Job: The Eclipse Jobs API by Michael Valenta (IBM). What matters here is that a Job is scheduled in Eclipse as an asynchronous background task: when its time has come and its resources are available, it is executed by calling its Job.run method. A Job is thus very similar to a thread, except that it is scheduled on conditions and can be served by a background thread pool. Scheduling depends on two things: the job's own timing, and the resource dependencies expressed through the ISchedulingRule interface. If a job conflicts with a job that is currently running, it is suspended until the conflict is resolved. ISchedulingRule instances can themselves be combined using the Composite pattern to describe complex dependencies. The StringPoolJob.run method, which does the actual work, merges the scheduling rules of all string pool partitions and performs the optimization once those rules allow it.
Java code:
//
// org.eclipse.core.internal.runtime.StringPoolJob
//
public class StringPoolJob extends Job
{
  private static final long RESCHEDULE_DELAY = 300000; // five minutes

  protected IStatus run(IProgressMonitor monitor)
  {
    // copy current participants to handle concurrent additions and removals to map
    Map.Entry[] entries = (Map.Entry[]) participants.entrySet().toArray(new Map.Entry[0]);
    ISchedulingRule[] rules = new ISchedulingRule[entries.length];
    IStringPoolParticipant[] toRun = new IStringPoolParticipant[entries.length];
    for (int i = 0; i < toRun.length; i++)
    {
      toRun[i] = (IStringPoolParticipant) entries[i].getKey();
      rules[i] = (ISchedulingRule) entries[i].getValue();
    }
    // merge the scheduling rules of all string pool partitions
    final ISchedulingRule rule = MultiRule.combine(rules);
    // call shareStrings to perform the optimization once the scheduling rule permits
    try
    {
      Platform.getJobManager().beginRule(rule, monitor); // block until the scheduling rule permits
      shareStrings(toRun, monitor);
    }
    finally
    {
      Platform.getJobManager().endRule(rule);
    }
    // reschedule this job to perform the next optimization
    long scheduleDelay = Math.max(RESCHEDULE_DELAY, lastDuration * 100);
    schedule(scheduleDelay);
    return Status.OK_STATUS;
  }
}
StringPoolJob.shareStrings simply traverses all partitions, calls the IStringPoolParticipant.shareStrings method of each root node to perform the optimization described earlier, and finally returns the optimization result. The pool itself serves only as an optimization tool and is discarded once the work is done.
Java code:
private int shareStrings(IStringPoolParticipant[] toRun, IProgressMonitor monitor)
{
  final StringPool pool = new StringPool();
  for (int i = 0; i < toRun.length; i++)
  {
    if (monitor.isCanceled()) // check whether the operation has been canceled
      break;
    final IStringPoolParticipant current = toRun[i];
    Platform.run(new ISafeRunnable() // safe execution
    {
      public void handleException(Throwable exception)
      {
        // exceptions are already logged, so nothing to do
      }

      public void run()
      {
        current.shareStrings(pool); // string reuse optimization
      }
    });
  }
  return pool.getSavedStringCount(); // return the optimization result
}
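The rescheduling rule in run, Math.max(RESCHEDULE_DELAY, lastDuration * 100), can be read as follows: the job waits at least five minutes, and after a slow run it waits 100 times as long as that run took, so the optimization should take up only about 1% of elapsed time. The sketch below isolates that arithmetic; the class RescheduleSketch and the method nextDelay are names invented for this example.

```java
// Hypothetical helper that isolates the delay computation from the listing.
class RescheduleSketch {
    static final long RESCHEDULE_DELAY = 300000; // five minutes, as in the listing

    /** Delay before the next run, given how long the last run took (milliseconds). */
    static long nextDelay(long lastDurationMillis) {
        return Math.max(RESCHEDULE_DELAY, lastDurationMillis * 100);
    }

    public static void main(String[] args) {
        System.out.println(nextDelay(200));  // short run: the five-minute floor applies
        System.out.println(nextDelay(5000)); // long run: waits 100 times the run time
    }
}
```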