The partitioned string-sharing optimization mechanism in Eclipse

xiaoxiao2021-03-06  35

original:

http://www.blogcn.com/user8/flier_lu/blog/6018564.html

In languages such as Java and C#, which handle strings with reference semantics, a string is an immutable object, so strings with identical content can be reused through some mechanism. In such languages there is no semantic difference between two references to equal strings stored at two different memory locations and two references to one and the same string. For programs that handle large numbers of strings, XML processing in particular, this optimization can noticeably reduce memory usage; the SAX parser standard, for example, defines a dedicated feature, http://xml.org/sax/features/string-interning, for string reuse.

String interning is supported at the language level in both Java and C# through String.intern (String.Intern in C#). For the CLR side, see my other article,

"Optimization of string uncapped in CLR"

Java works in a very similar way. The String.intern method uses the content of the current string as the key and the object reference as the value, and stores the pair in a global hash table.

Java code:

//
// java/lang/String.java
//
public final class String
{
  // ...
  public native String intern(); // implemented as a native (JNI) function for efficiency
}

//
// hotspot/src/share/vm/prims/jvm.cpp
//
JVM_ENTRY(jstring, JVM_InternString(JNIEnv *env, jstring str))
  JVMWrapper("JVM_InternString");
  if (str == NULL) return NULL;
  oop string = JNIHandles::resolve_non_null(str);       // resolve the reference to the internal object
  oop result = StringTable::intern(string, CHECK_0);    // perform the actual intern operation
  return (jstring) JNIHandles::make_local(env, result); // wrap the internal object in a local reference
JVM_END

//
// hotspot/src/share/vm/memory/symbolTable.cpp
//
oop StringTable::intern(oop string, TRAPS)
{
  if (string == NULL) return NULL;
  ResourceMark rm(THREAD); // protect the thread resource area
  int length;
  Handle h_string(THREAD, string);
  jchar* chars = java_lang_String::as_unicode_string(string, length); // get the actual string content
  oop result = intern(h_string, chars, length, CHECK_0);              // perform the intern operation
  return result;
}

oop StringTable::intern(Handle string_or_null, jchar* name, int len, TRAPS)
{
  int hashValue = hash_string(name, len);           // first compute the hash value from the string content
  StringTableBucket* bucket = bucketFor(hashValue); // locate the target bucket from the hash value
  oop string = bucket->lookup(name, len);           // then check whether the string already exists

  // Found
  if (string != NULL) return string;

  // Otherwise, add to symbol to table
  return basic_add(string_or_null, name, len, hashValue, CHECK_0); // put the string into the hash table
}
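The visible effect of this global table can be checked from plain Java. The following is a minimal sketch rather than code from the JDK or from Eclipse; the class name InternDemo and the helper sameAfterIntern are made up for illustration. It relies only on the documented contract of String.intern, namely that strings with equal content map to one canonical instance.

```java
// Illustrative sketch: InternDemo is a hypothetical name, not JDK or Eclipse code.
final class InternDemo {

    /** Returns true if both arguments resolve to the same canonical instance. */
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct heap objects with equal content.
        String first = new String("workspace");
        String second = new String("workspace");

        System.out.println(first == second);                // false: different objects
        System.out.println(first.equals(second));           // true: same content
        System.out.println(sameAfterIntern(first, second)); // true: one shared instance
    }
}
```

Note that the caller has to keep the reference returned by intern; the original object is left untouched and simply becomes garbage once nothing refers to it.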

Strings in the global string table cannot be removed explicitly by hand. Only when the garbage collector determines that a string is no longer in use does it finally call the StringTable::unlink method to remove the entry.

Java code:

//
// hotspot/src/share/vm/memory/genMarkSweep.cpp
//
void GenMarkSweep::mark_sweep_phase1(...)
{
  // ...
  StringTable::unlink();
}

//
// hotspot/src/share/vm/memory/symbolTable.cpp
//
void StringTable::unlink()
{
  // Readers of the string table are unlocked, so we should only be
  // removing entries at a safepoint.
  assert(SafepointSynchronize::is_at_safepoint(), "must be at safepoint");
  for (StringTableBucket* bucket = firstBucket(); bucket <= lastBucket(); bucket++)
  {
    for (StringTableEntry** p = bucket->entry_addr(); *p != NULL;)
    {
      StringTableEntry* entry = *p;
      assert(entry->literal_string() != NULL, "just checking");
      if (entry->literal_string()->is_gc_marked())
      {
        // the string object is still reachable
        // Is this one of calls those necessary only for verification? (dld)
        entry->oops_do(&MarkSweep::follow_root_closure);
        p = entry->next_addr();
      }
      else
      {
        // if it is unreachable, return the entry to the memory pool
        *p = entry->next();
        entry->set_next(free_list);
        free_list = entry;
      }
    }
  }
}
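The sweep over a single bucket chain can be restated in Java to make the pointer manipulation easier to follow. This is only a sketch under stated assumptions: BucketSweepSketch and Entry are hypothetical names, the marked flag stands in for the collector's mark bit, and a previous-entry reference replaces the pointer-to-pointer used in the C++ code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: BucketSweepSketch and Entry are hypothetical names
// that mimic one bucket chain of a string table; this is not HotSpot code.
final class BucketSweepSketch {

    static final class Entry {
        final String literal;
        final boolean marked; // stands in for literal_string()->is_gc_marked()
        Entry next;

        Entry(String literal, boolean marked, Entry next) {
            this.literal = literal;
            this.marked = marked;
            this.next = next;
        }
    }

    Entry head;     // first entry of the bucket chain
    Entry freeList; // recycled entries, most recently freed first

    /** Unlinks every unmarked entry and pushes it onto the free list. */
    void unlink() {
        Entry previous = null;
        Entry entry = head;
        while (entry != null) {
            Entry next = entry.next; // remember the successor before relinking
            if (entry.marked) {
                previous = entry;          // still reachable: keep it
            } else {
                if (previous == null) {
                    head = next;           // removing the head of the chain
                } else {
                    previous.next = next;  // bypass the dead entry
                }
                entry.next = freeList;     // recycle the entry
                freeList = entry;
            }
            entry = next;
        }
    }

    /** Collects the literals of a chain in order, for inspection. */
    static List<String> literals(Entry start) {
        List<String> result = new ArrayList<>();
        for (Entry e = start; e != null; e = e.next) {
            result.add(e.literal);
        }
        return result;
    }

    public static void main(String[] args) {
        BucketSweepSketch bucket = new BucketSweepSketch();
        bucket.head = new Entry("a", true,
                new Entry("b", false,
                new Entry("c", true,
                new Entry("d", false, null))));
        bucket.unlink();
        System.out.println(literals(bucket.head));     // [a, c]
        System.out.println(literals(bucket.freeList)); // [d, b]
    }
}
```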

From the code above we can see directly how String.intern provides global, hash-table-based string sharing in the JVM (Sun JDK 1.4.2). The implementation is simple and maximizes sharing, but it has drawbacks: the sharing granularity is too coarse, the effect of the optimization cannot be measured, and a large number of interned strings may degrade the performance of the global string table.

For these reasons Eclipse sets aside the JVM-level string sharing mechanism and, to a certain extent, alleviates the problem with a fine-grained, fully controllable, and measurable partitioned string-sharing mechanism. Users explicitly implement the Eclipse core IStringPoolParticipant interface and submit the strings that need sharing in its shareStrings method.

Java code:

//
// org.eclipse.core.runtime.IStringPoolParticipant
//
public interface IStringPoolParticipant
{
  /**
   * Instructs this participant to share its strings in the provided
   * pool.
   */
  public void shareStrings(StringPool pool);
}

For example, the MarkerInfo type implements the IStringPoolParticipant interface. In its shareStrings method it submits its own type string and then passes the request on to its attribute map, if that map is itself a participant.

Java code:

//
// org.eclipse.core.internal.resources.MarkerInfo
//
public class MarkerInfo implements ..., IStringPoolParticipant
{
  public void shareStrings(StringPool set)
  {
    type = set.add(type);
    Map map = attributes;
    if (map instanceof IStringPoolParticipant)
      ((IStringPoolParticipant) map).shareStrings(set);
  }
}

In this way, as long as the objects in an object tree selectively implement the IStringPoolParticipant interface, all the strings that need sharing can be submitted to one string pool, level by level, in a single pass. Workspace is one such root entry point for string sharing: after opening the workspace, its open method registers a participant that manages string sharing and adds it to the global list of string pool partitions to be optimized.

Java code:

//
// org.eclipse.core.internal.resources.Workspace
//
public class Workspace ...
{
  protected SaveManager saveManager;

  public IStatus open(IProgressMonitor monitor) throws CoreException
  {
    // open the workspace

    // finally, register a new string pool partition
    InternalPlatform.getDefault().addStringPoolParticipant(saveManager, getRoot());
    return Status.OK_STATUS;
  }
}

A type that needs optimizing, such as SaveManager, only has to implement the IStringPoolParticipant interface and, when called, submit its own strings together with those of its child elements. The child elements do not even need to implement IStringPoolParticipant; they simply pass the submission down level by level, for example:

Java code:

//
// org.eclipse.core.internal.resources.SaveManager
//
public class SaveManager implements ..., IStringPoolParticipant
{
  protected ElementTree lastSnap;

  public void shareStrings(StringPool pool)
  {
    lastSnap.shareStrings(pool);
  }
}

//
// org.eclipse.core.internal.watson.ElementTree
//
public class ElementTree
{
  protected DeltaDataTree tree;

  public void shareStrings(StringPool set)
  {
    tree.storeStrings(set);
  }
}

//
// org.eclipse.core.internal.dtree.DeltaDataTree
//
public class DeltaDataTree extends AbstractDataTree
{
  private AbstractDataTreeNode rootNode;
  private DeltaDataTree parent;

  public void storeStrings(StringPool set)
  {
    // copy fields to protect against concurrent changes
    AbstractDataTreeNode root = rootNode;
    DeltaDataTree dad = parent;
    if (root != null)
      root.storeStrings(set);
    if (dad != null)
      dad.storeStrings(set);
  }
}

//
// org.eclipse.core.internal.dtree.AbstractDataTreeNode
//
public abstract class AbstractDataTreeNode
{
  protected AbstractDataTreeNode children[];
  protected String name;

  public void storeStrings(StringPool set)
  {
    name = set.add(name);
    // copy children pointer in case of concurrent modification
    AbstractDataTreeNode[] nodes = children;
    if (nodes != null)
      for (int i = nodes.length; --i >= 0;)
        nodes[i].storeStrings(set);
  }
}
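The level-by-level submission can be tried out without Eclipse on the classpath. The sketch below is an illustration under assumptions, not Eclipse code: Node is a hypothetical stand-in for AbstractDataTreeNode, and a plain Map<String, String> plays the role of the StringPool.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: Node and the map-based pool are hypothetical stand-ins
// for AbstractDataTreeNode and StringPool; this is not Eclipse code.
final class TreeSharingSketch {

    static final class Node {
        String name;
        final Node[] children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = children;
        }

        /** Replaces this node's name, and those of all descendants, with pooled copies. */
        void storeStrings(Map<String, String> pool) {
            if (name != null) {
                String pooled = pool.putIfAbsent(name, name);
                if (pooled != null) {
                    name = pooled; // an equal string was already pooled: reuse it
                }
            }
            for (Node child : children) {
                child.storeStrings(pool);
            }
        }
    }

    public static void main(String[] args) {
        // new String(...) forces distinct objects with equal content.
        Node left = new Node(new String("src"));
        Node right = new Node(new String("src"));
        Node root = new Node("project", left, right);

        System.out.println(left.name == right.name); // false before sharing
        root.storeStrings(new HashMap<>());
        System.out.println(left.name == right.name); // true after sharing
    }
}
```

Only the root needs to be handed to whoever drives the optimization; every other node is reached through the recursion.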

All strings to be optimized are submitted to a single string pool through the StringPool.add method. This pool differs slightly from the JVM-level string table: it serves only as a temporary staging area while the string pool partitions are being optimized, and it is not a permanent holder of references to the strings. It is therefore just a thin wrapper around HashMap, which additionally keeps a rough estimate of the space saved as a measure of the optimization's effect.

Java code:

//
// org.eclipse.core.runtime.StringPool
//
public final class StringPool
{
  private int savings;
  private final HashMap map = new HashMap();

  public StringPool()
  {
    super();
  }

  public String add(String string)
  {
    if (string == null)
      return string;
    Object result = map.get(string);
    if (result != null)
    {
      if (result != string)
        savings += 44 + 2 * string.length();
      return (String) result;
    }
    map.put(string, string);
    return string;
  }

  // get a rough estimate of how much space was saved
  public int getSavedStringCount()
  {
    return savings;
  }
}
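To see how the savings counter behaves, the add logic can be exercised in isolation. The following is a small sketch, not the Eclipse class: PoolSketch is a hypothetical, generics-based stand-in, and the constant 44 and the factor 2 are simply taken over from the add method above. Reading them as a fixed per-object overhead plus two bytes per character is my assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: PoolSketch is a hypothetical stand-in that follows the
// add() logic shown above; it is not the Eclipse StringPool class itself.
final class PoolSketch {
    private final Map<String, String> map = new HashMap<>();
    private int savings;

    String add(String string) {
        if (string == null) {
            return null;
        }
        String existing = map.get(string);
        if (existing != null) {
            if (existing != string) {
                // a distinct duplicate object: count its estimated size as saved
                savings += 44 + 2 * string.length();
            }
            return existing;
        }
        map.put(string, string);
        return string;
    }

    int getSavings() {
        return savings;
    }

    public static void main(String[] args) {
        PoolSketch pool = new PoolSketch();
        String s1 = new String("hello");
        String s2 = new String("hello"); // same content, different object

        pool.add(s1);
        System.out.println(pool.getSavings());  // 0: the first copy becomes the canonical one
        System.out.println(pool.add(s2) == s1); // true: the caller should keep s1 from now on
        System.out.println(pool.getSavings());  // 54 = 44 + 2 * 5
        pool.add(s2);                           // the same duplicate, held by another owner
        System.out.println(pool.getSavings());  // 108: the one duplicate is counted twice
    }
}
```

The last two lines show that the counter grows per submission rather than per distinct duplicate object, so it is best read as an upper-bound style estimate.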

However, the estimate here is imprecise. Suppose the pool already contains a string S1, and a string S2 with the same content but a different physical location is then submitted. If S2 is submitted several times, the saving is counted each time, so the estimate comes out too high. If an exact value is needed, the code can of course be restructured to track every string that has been optimized and produce accurate metrics, at some cost in efficiency.

Now that we understand how the strings to be optimized are submitted and what happens to them afterwards, let us look at how the Eclipse core ties everything together. The Workspace.open method calls InternalPlatform.addStringPoolParticipant, which adds a string pool partition to the global optimization task queue.

Java code:

//
// org.eclipse.core.internal.runtime.InternalPlatform
//
public final class InternalPlatform
{
  private StringPoolJob stringPoolJob;

  public void addStringPoolParticipant(IStringPoolParticipant participant, ISchedulingRule rule)
  {
    if (stringPoolJob == null)
      stringPoolJob = new StringPoolJob(); // singleton
    stringPoolJob.addStringPoolParticipant(participant, rule);
  }
}

//
// org.eclipse.core.internal.runtime.StringPoolJob
//
public class StringPoolJob extends Job
{
  private static final long INITIAL_DELAY = 10000; // 10000 ms is ten seconds (the source comment reads "five seconds")
  private Map participants = Collections.synchronizedMap(new HashMap(10));

  public void addStringPoolParticipant(IStringPoolParticipant participant, ISchedulingRule rule)
  {
    participants.put(participant, rule);
    if (sleep())
      wakeUp(INITIAL_DELAY);
  }

  public void removeStringPoolParticipant(IStringPoolParticipant participant)
  {
    participants.remove(participant);
  }
}

This task then performs the sharing optimization for every registered partition at the appropriate time. The StringPoolJob type holds the code for the partition task, and underneath it relies on Eclipse's job scheduling mechanism. For more on Eclipse job scheduling, interested readers can refer to the article "On the Job: The Eclipse Jobs API" by Michael Valenta (IBM).

What needs to be understood here is that a Job is scheduled by Eclipse as an asynchronous background task; when its time has come and its resources are ready, it is executed by calling its Job.run method. A Job is thus very similar to a thread, except that it is triggered by conditions and can be served from a background thread pool. Scheduling depends on two things: the job's own timing, and the resource dependencies expressed through the ISchedulingRule interface. If a job conflicts with a job that is currently running, it is suspended until the conflict is resolved. ISchedulingRule instances can themselves be combined using the Composite pattern to describe complex dependencies.

In the StringPoolJob.run method, which does the actual work, the scheduling rules of all string pool partitions are merged, and the real work is carried out once those rules permit.

Java code:

//
// org.eclipse.core.internal.runtime.StringPoolJob
//
public class StringPoolJob extends Job
{
  private static final long RESCHEDULE_DELAY = 300000; // five minutes

  protected IStatus run(IProgressMonitor monitor)
  {
    // copy current participants to handle concurrent additions and removals to map
    Map.Entry[] entries = (Map.Entry[]) participants.entrySet().toArray(new Map.Entry[0]);
    ISchedulingRule[] rules = new ISchedulingRule[entries.length];
    IStringPoolParticipant[] toRun = new IStringPoolParticipant[entries.length];
    for (int i = 0; i < toRun.length; i++)
    {
      toRun[i] = (IStringPoolParticipant) entries[i].getKey();
      rules[i] = (ISchedulingRule) entries[i].getValue();
    }

    // merge the scheduling rules of all string pool partitions
    final ISchedulingRule rule = MultiRule.combine(rules);

    // call shareStrings to perform the optimization once the scheduling rules permit
    try
    {
      Platform.getJobManager().beginRule(rule, monitor); // block until the scheduling rule allows
      shareStrings(toRun, monitor);
    }
    finally
    {
      Platform.getJobManager().endRule(rule);
    }

    // reschedule ourselves for the next optimization pass
    long scheduleDelay = Math.max(RESCHEDULE_DELAY, lastDuration * 100);
    schedule(scheduleDelay);
    return Status.OK_STATUS;
  }
}
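The rescheduling formula at the end of run is worth a quick worked example. The sketch below assumes that lastDuration holds the duration of the previous pass in milliseconds, which the excerpt does not show; RescheduleSketch and nextDelay are hypothetical names, and only the max(...) expression and the 300000 ms constant come from the code above.

```java
// Illustrative sketch: RescheduleSketch and nextDelay are hypothetical names;
// only the max(...) formula and the 300000 ms constant come from the code above.
final class RescheduleSketch {
    static final long RESCHEDULE_DELAY = 300000; // five minutes, in milliseconds

    /** Delay before the next pass, given how long the last pass took (ms). */
    static long nextDelay(long lastDurationMillis) {
        return Math.max(RESCHEDULE_DELAY, lastDurationMillis * 100);
    }

    public static void main(String[] args) {
        System.out.println(nextDelay(50));   // 300000: a fast pass keeps the five-minute floor
        System.out.println(nextDelay(5000)); // 500000: a 5 s pass waits 500 s before the next one
    }
}
```

Under that assumption the job waits at least one hundred times as long as its last pass took, so it should occupy roughly one percent of wall-clock time at most, and never runs more often than once every five minutes.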

StringPoolJob.shareStrings simply iterates over all partitions, calls the IStringPoolParticipant.shareStrings method of each partition's root node to perform the optimization described earlier, and finally returns the optimization result for the partitions. The pool itself serves only as a tool for the optimization and is discarded once the work is done.

Java code:

  private int shareStrings(IStringPoolParticipant[] toRun, IProgressMonitor monitor)
  {
    final StringPool pool = new StringPool();
    for (int i = 0; i < toRun.length; i++)
    {
      if (monitor.isCanceled()) // has the operation been canceled?
        break;
      final IStringPoolParticipant current = toRun[i];
      Platform.run(new ISafeRunnable() // run safely
      {
        public void handleException(Throwable exception)
        {
          // exceptions are already logged, so nothing to do
        }

        public void run()
        {
          current.shareStrings(pool); // string reuse optimization
        }
      });
    }
    return pool.getSavedStringCount(); // return the optimization result
  }
}
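The point of wrapping each participant in a safe runnable is that a failure in one partition does not stop the others from being optimized. A minimal sketch of that isolation, using plain Runnable and try/catch instead of the Eclipse API, could look as follows; SafeRunSketch and runAll are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: SafeRunSketch and runAll are hypothetical names that
// imitate the isolation a safe runner provides; this is not the Eclipse API.
final class SafeRunSketch {

    /** Runs every task, swallowing runtime failures, and returns how many completed. */
    static int runAll(List<Runnable> tasks) {
        int completed = 0;
        for (Runnable task : tasks) {
            try {
                task.run();
                completed++;
            } catch (RuntimeException failure) {
                // a real implementation would log the failure here
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        List<Runnable> tasks = new ArrayList<>();
        tasks.add(() -> System.out.println("participant 1 shared its strings"));
        tasks.add(() -> { throw new IllegalStateException("participant 2 failed"); });
        tasks.add(() -> System.out.println("participant 3 shared its strings"));

        System.out.println(runAll(tasks)); // 2: the failure did not stop participant 3
    }
}
```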
