Spending 1K of memory on a RandomAccessFile class with efficient I/O

Posted by xiaoxiao, 2021-03-06

Improving I/O performance by extending the RandomAccessFile class with a buffer

Cui Zhixiang (Bladeinco@citiz.net), Product Manager, Shanghai Yidong Network Information Co., Ltd.
November 2002

Java's random file access class (RandomAccessFile) has poor I/O efficiency. This article analyzes the cause, proposes a solution, and shows step by step how to build and optimize a random-access file class with buffered reads and writes. Performance comparisons against other file access classes demonstrate its practical value.

Main body:

The most widely used J2SDK version at present is the 1.3 series. Developers on this version who need random file access have to use the RandomAccessFile class. Its I/O performance lags far behind that of other common development languages and seriously hurts program efficiency. To improve on it, developers need to analyze the source code of RandomAccessFile and related file classes, find the crux of the problem, and then build a random file access class with a good performance-to-cost ratio: BufferedRandomAccessFile.

1. Before improving anything, run a baseline test: copy a 12 MB file byte by byte (this involves both reading and writing).

Read                                   Write                                    Time (seconds)
RandomAccessFile                       RandomAccessFile                         95.848
BufferedInputStream + DataInputStream  BufferedOutputStream + DataOutputStream  2.935

The gap between the two is about 32 times; RandomAccessFile is far too slow. To find the reason, start by comparing the source code of the two.

1.1. [RandomAccessFile]

public class RandomAccessFile implements DataOutput, DataInput {
    public final byte readByte() throws IOException {
        int ch = this.read();
        if (ch < 0)
            throw new EOFException();
        return (byte) (ch);
    }

    // Each call goes straight to native code, i.e. one I/O operation per byte.
    public native int read() throws IOException;

    public final void writeByte(int v) throws IOException {
        write(v);
    }

    // Likewise one native I/O operation per byte written.
    public native void write(int b) throws IOException;
}

As the code shows, RandomAccessFile performs a native I/O operation for every single byte it reads or writes.

1.2. [BufferedInputStream]

public class BufferedInputStream extends FilterInputStream {
    private static int defaultBufferSize = 2048;
    protected byte buf[]; // the read buffer

    public BufferedInputStream(InputStream in, int size) {
        super(in);
        if (size <= 0) {
            throw new IllegalArgumentException("Buffer size <= 0");
        }
        buf = new byte[size];
    }

    public synchronized int read() throws IOException {
        ensureOpen();
        if (pos >= count) {
            fill();
            if (pos >= count)
                return -1;
        }
        return buf[pos++] & 0xff; // read directly from buf[]
    }

    private void fill() throws IOException {
        if (markpos < 0)
            pos = 0; /* no mark: throw away the buffer */
        else if (pos >= buf.length) /* no room left in buffer */
            if (markpos > 0) { /* can throw away early part of the buffer */
                int sz = pos - markpos;
                System.arraycopy(buf, markpos, buf, 0, sz);
                pos = sz;
                markpos = 0;
            } else if (buf.length >= marklimit) {
                markpos = -1; /* buffer got too big, invalidate mark */
                pos = 0; /* drop buffer contents */
            } else { /* grow buffer */
                int nsz = pos * 2;
                if (nsz > marklimit)
                    nsz = marklimit;
                byte nbuf[] = new byte[nsz];
                System.arraycopy(buf, 0, nbuf, 0, pos);
                buf = nbuf;
            }
        count = pos;
        int n = in.read(buf, pos, buf.length - pos);
        if (n > 0)
            count = n + pos;
    }
}
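The effect of the read buffer can be observed without a disk at all. The following is a minimal sketch, not part of the original article: `BufferedReadDemo` and its nested `CountingInputStream` are hypothetical helper names, and an in-memory `ByteArrayInputStream` stands in for the file. It counts how many read requests actually reach the underlying source when bytes are consumed one at a time, with and without `BufferedInputStream`.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferedReadDemo {
    // Hypothetical helper: counts every read request that reaches the wrapped stream.
    static class CountingInputStream extends InputStream {
        private final InputStream in;
        int calls = 0;

        CountingInputStream(InputStream in) { this.in = in; }

        @Override public int read() throws IOException {
            calls++;
            return in.read();
        }

        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return in.read(b, off, len);
        }
    }

    // Consumes the stream one byte at a time until end of stream.
    static void drain(InputStream in) throws IOException {
        while (in.read() != -1) { /* discard */ }
    }

    // Requests reaching the source when `size` bytes are read directly.
    static int directCalls(int size) throws IOException {
        CountingInputStream src = new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        drain(src);
        return src.calls;
    }

    // Requests reaching the source when reading through a buffer of bufSize bytes.
    static int bufferedCalls(int size, int bufSize) throws IOException {
        CountingInputStream src = new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        drain(new BufferedInputStream(src, bufSize));
        return src.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("direct requests:   " + directCalls(8192));
        System.out.println("buffered requests: " + bufferedCalls(8192, 1024));
    }
}
```

With a real file behind the stream, each counted request would correspond to a system call, which is why the buffered variant is so much cheaper.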

1.3. [BufferedOutputStream]

public class BufferedOutputStream extends FilterOutputStream {
    protected byte buf[]; // the write buffer

    public BufferedOutputStream(OutputStream out, int size) {
        super(out);
        if (size <= 0) {
            throw new IllegalArgumentException("Buffer size <= 0");
        }
        buf = new byte[size];
    }

    public synchronized void write(int b) throws IOException {
        if (count >= buf.length) {
            flushBuffer();
        }
        buf[count++] = (byte) b; // write directly into buf[]
    }

    private void flushBuffer() throws IOException {
        if (count > 0) {
            out.write(buf, 0, count);
            count = 0;
        }
    }
}
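The baseline test described at the start of this section could be reproduced roughly as follows. This is a sketch under stated assumptions rather than the author's test program: the class name `CopyBaseline`, the temporary files, and the 256 KB test size are all illustrative choices, and absolute timings will differ greatly from the 2002 figures on modern hardware and JVMs.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Random;

class CopyBaseline {
    // Byte-by-byte copy using plain RandomAccessFile on both sides.
    static void copyWithRaf(File src, File dst) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(src, "r");
             RandomAccessFile out = new RandomAccessFile(dst, "rw")) {
            out.setLength(0); // start from an empty destination
            long len = in.length();
            for (long i = 0; i < len; i++) {
                out.writeByte(in.readByte());
            }
        }
    }

    // Byte-by-byte copy through buffered data streams.
    static void copyWithStreams(File src, File dst) throws IOException {
        try (DataInputStream in = new DataInputStream(
                 new BufferedInputStream(new FileInputStream(src)));
             DataOutputStream out = new DataOutputStream(
                 new BufferedOutputStream(new FileOutputStream(dst)))) {
            long len = src.length();
            for (long i = 0; i < len; i++) {
                out.writeByte(in.readByte());
            }
        }
    }

    // Creates a temporary file holding `size` pseudo-random bytes.
    static File makeTestFile(int size) throws IOException {
        File f = File.createTempFile("copy-src", ".bin");
        f.deleteOnExit();
        byte[] data = new byte[size];
        new Random(42).nextBytes(data);
        Files.write(f.toPath(), data);
        return f;
    }

    static boolean sameContent(File a, File b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a.toPath()), Files.readAllBytes(b.toPath()));
    }

    public static void main(String[] args) throws IOException {
        File src = makeTestFile(256 * 1024);
        File viaRaf = File.createTempFile("copy-raf", ".bin");
        File viaStreams = File.createTempFile("copy-streams", ".bin");
        viaRaf.deleteOnExit();
        viaStreams.deleteOnExit();

        long t0 = System.nanoTime();
        copyWithRaf(src, viaRaf);
        long t1 = System.nanoTime();
        copyWithStreams(src, viaStreams);
        long t2 = System.nanoTime();

        System.out.println("RandomAccessFile: " + (t1 - t0) / 1e6 + " ms");
        System.out.println("buffered streams: " + (t2 - t1) / 1e6 + " ms");
    }
}
```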

As the code shows, when a Buffered I/O stream reads or writes one byte, it operates directly on buf[] if the data is already there; otherwise it first fills buf[] from the corresponding position on disk (or flushes buf[] to disk) and then operates on it. The vast majority of read and write operations are therefore operations on the in-memory buf[].

1.4. Summary

Memory access times are on the order of nanoseconds (10^-9 s), while disk access times are on the order of milliseconds (10^-3 s), so for a single operation memory is roughly a million times faster than disk. In theory, even ten thousand memory operations cost far less time than a single disk I/O. The buffered classes add an in-memory buf[] to cut the number of disk I/O operations and so raise access efficiency, at the price of some extra buffer-management overhead. In the test above, access efficiency improved by a factor of about 32.

2. Following the conclusion of 1.4, add a buffered read/write mechanism to the RandomAccessFile class.

Random-access classes differ from sequential ones: the former are built by implementing the DataInput/DataOutput interfaces, the latter by extending FilterInputStream/FilterOutputStream, so the stream code cannot simply be copied over.

2.1. Allocate a buffer buf[] (default: 1024 bytes) to serve as a shared read/write buffer.

2.2. Implement read buffering first. The basic read-buffering logic is:

A. A byte at file position pos is to be read.
B. Check whether buf[] currently maps that position. If so, read the byte directly from buf[] and return it.
C. If not, reposition buf[] to the block containing pos, fill it with the bufsize bytes of the file around that position, and go back to B.

The key code and its explanation follow:

public class BufferedRandomAccessFile extends RandomAccessFile {

    // byte read(long pos): read the byte at position pos of the current file.
    // bufstartpos and bufendpos are the file offsets mapped by the first and last bytes of buf[].
    // curpos is the offset of this class's current file pointer.
    public byte read(long pos) throws IOException {
        if (pos < this.bufstartpos || pos > this.bufendpos) {
            this.flushbuf();
            this.seek(pos);
            if ((pos < this.bufstartpos) || (pos > this.bufendpos))
                throw new IOException();
        }
        this.curpos = pos;
        return this.buf[(int) (pos - this.bufstartpos)];
    }

    // void flushbuf(): if bufdirty is true, write the data in buf[] that has not yet reached the disk.
    private void flushbuf() throws IOException {
        if (this.bufdirty == true) {
            if (super.getFilePointer() != this.bufstartpos) {
                super.seek(this.bufstartpos);
            }
            super.write(this.buf, 0, this.bufusedsize);
            this.bufdirty = false;
        }
    }

    // void seek(long pos): move the file pointer to pos and map buf[] onto the file block containing pos.
    public void seek(long pos) throws IOException {
        if ((pos < this.bufstartpos) || (pos > this.bufendpos)) { // seek pos not in buf
            this.flushbuf();
            if ((pos >= 0) && (pos <= this.fileendpos) && (this.fileendpos != 0)) {
                // seek pos in file (file length > 0)
                this.bufstartpos = pos / this.bufsize * this.bufsize; // round down to a block boundary
                this.bufusedsize = this.fillbuf();
            } else if (((pos == 0) && (this.fileendpos == 0))
                    || (pos == this.fileendpos + 1)) {
                // seek pos is append pos
                this.bufstartpos = pos;
                this.bufusedsize = 0;
            }
            this.bufendpos = this.bufstartpos + this.bufsize - 1;
        }
        this.curpos = pos;
    }

    // int fillbuf(): fill buf[] starting from bufstartpos.
    private int fillbuf() throws IOException {
        super.seek(this.bufstartpos);
        this.bufdirty = false;
        return super.read(this.buf);
    }
}
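The read-buffering steps A to C can also be tried in isolation. The sketch below is not the article's class: `SimpleBufferedReadFile` is a hypothetical, read-only illustration that wraps a `RandomAccessFile` instead of extending it, and its `refills` counter exists only to make the number of real file reads visible. It assumes a buffer size that is a power of two, as in the article.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;

class SimpleBufferedReadFile implements AutoCloseable {
    private static final int BUF_SIZE = 1024;                 // power of two
    private static final long BUF_MASK = ~((long) BUF_SIZE - 1);

    private final RandomAccessFile raf;
    private final byte[] buf = new byte[BUF_SIZE];
    private long bufStart = -1; // file offset mapped by buf[0]; -1 means nothing is cached yet
    private int bufUsed = 0;    // number of valid bytes in buf
    int refills = 0;            // how many times buf was (re)loaded from the file

    SimpleBufferedReadFile(File f) throws IOException {
        raf = new RandomAccessFile(f, "r");
    }

    // Returns the byte at absolute offset pos as 0..255, or -1 if pos is past the end of the file.
    int read(long pos) throws IOException {
        if (pos < 0) throw new IllegalArgumentException("negative position");
        // Step B: is pos currently mapped by buf?
        if (bufStart < 0 || pos < bufStart || pos >= bufStart + bufUsed) {
            // Step C: remap buf onto the block containing pos, then retry.
            fill(pos & BUF_MASK);
            if (pos >= bufStart + bufUsed) return -1;
        }
        return buf[(int) (pos - bufStart)] & 0xFF;
    }

    // Loads up to BUF_SIZE bytes starting at blockStart into buf.
    private void fill(long blockStart) throws IOException {
        raf.seek(blockStart);
        int total = 0;
        while (total < BUF_SIZE) {
            int n = raf.read(buf, total, BUF_SIZE - total);
            if (n < 0) break; // end of file
            total += n;
        }
        bufStart = blockStart;
        bufUsed = total;
        refills++;
    }

    @Override public void close() throws IOException {
        raf.close();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("buffered-read", ".bin");
        f.deleteOnExit();
        Files.write(f.toPath(), new byte[3000]);
        try (SimpleBufferedReadFile r = new SimpleBufferedReadFile(f)) {
            for (long pos = 0; pos < 3000; pos++) r.read(pos);
            System.out.println("bytes read: 3000, buffer refills: " + r.refills);
        }
    }
}
```

Composition is used here only to keep the sketch short; the article's approach of extending RandomAccessFile has the advantage that existing callers can switch classes without other changes.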

With this, buffered reading is basically implemented. Copying a 12 MB file byte by byte (this involves both reading and writing; here BufferedRandomAccessFile is tried on the read side) gives:

Read                                   Write                                    Time (seconds)
RandomAccessFile                       RandomAccessFile                         95.848
BufferedRandomAccessFile               BufferedOutputStream + DataOutputStream  2.813
BufferedInputStream + DataInputStream  BufferedOutputStream + DataOutputStream  2.935

The speed-up is clear, and the result is comparable to BufferedInputStream + DataInputStream.

2.3. Implement write buffering. The basic write-buffering logic is:

A. A byte is to be written at file position pos.
B. Check whether buf[] currently maps that position. If so, write the byte directly into buf[] and return true.
C. If not, reposition buf[] to the block containing pos, fill it with the bufsize bytes of the file around that position, and go back to B.

The key code follows:

// boolean write(byte bw, long pos): write byte bw at position pos of the current file.

// Depending on where pos falls relative to buf[], the write is a modification or an append,
// inside or outside buf[]. The most likely case is tested first, which speeds up the logic.
// fileendpos: position of the last byte of the current file; needed mainly to handle appends.
public boolean write(byte bw, long pos) throws IOException {
    if ((pos >= this.bufstartpos) && (pos <= this.bufendpos)) {
        // write pos in buf
        this.buf[(int) (pos - this.bufstartpos)] = bw;
        this.bufdirty = true;
        if (pos == this.fileendpos + 1) { // write pos is append pos
            this.fileendpos++;
            this.bufusedsize++;
        }
    } else { // write pos not in buf
        this.seek(pos);
        if ((pos >= 0) && (pos <= this.fileendpos) && (this.fileendpos != 0)) {
            // write pos is modify file
            this.buf[(int) (pos - this.bufstartpos)] = bw;
        } else if (((pos == 0) && (this.fileendpos == 0))
                || (pos == this.fileendpos + 1)) { // write pos is append pos
            this.buf[0] = bw;
            this.fileendpos++;
            this.bufusedsize = 1;
        } else {
            throw new IndexOutOfBoundsException();
        }
        this.bufdirty = true;
    }
    this.curpos = pos;
    return true;
}
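The write-buffering idea (modify buf[] in memory, mark it dirty, and write it out only when the buffer has to move or the file is closed) can be sketched on its own. The class below is hypothetical and deliberately simplified: `SimpleWriteBackFile` wraps a `RandomAccessFile` rather than extending it, tracks the file end through `length()` instead of a `fileendpos` field, and exposes a `flushes` counter purely for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class SimpleWriteBackFile implements AutoCloseable {
    private static final int BUF_SIZE = 1024;                 // power of two
    private static final long BUF_MASK = ~((long) BUF_SIZE - 1);

    private final RandomAccessFile raf;
    private final byte[] buf = new byte[BUF_SIZE];
    private long bufStart = 0;     // file offset mapped by buf[0]
    private int bufUsed = 0;       // number of valid bytes in buf
    private boolean dirty = false; // true while buf holds changes not yet on disk
    int flushes = 0;               // how many times buf was written to the file

    SimpleWriteBackFile(File f) throws IOException {
        raf = new RandomAccessFile(f, "rw");
        load(0);
    }

    // Writes one byte at offset pos; pos may be at most one past the last byte (append).
    void write(byte b, long pos) throws IOException {
        if (pos < 0 || pos > length()) throw new IndexOutOfBoundsException("pos=" + pos);
        if (pos < bufStart || pos >= bufStart + BUF_SIZE) {
            flush();              // save pending changes before moving the buffer
            load(pos & BUF_MASK); // map buf onto the block containing pos
        }
        int idx = (int) (pos - bufStart);
        buf[idx] = b;
        if (idx >= bufUsed) bufUsed = idx + 1; // the write extended the file
        dirty = true;
    }

    // Loads the block starting at blockStart (possibly empty or partial at end of file).
    private void load(long blockStart) throws IOException {
        raf.seek(blockStart);
        int total = 0;
        while (total < BUF_SIZE) {
            int n = raf.read(buf, total, BUF_SIZE - total);
            if (n < 0) break;
            total += n;
        }
        bufStart = blockStart;
        bufUsed = total;
    }

    // Writes the buffered block back to the file if it has been modified.
    void flush() throws IOException {
        if (!dirty) return;
        raf.seek(bufStart);
        raf.write(buf, 0, bufUsed);
        dirty = false;
        flushes++;
    }

    // Logical length: the larger of the on-disk length and the end of the buffered data.
    long length() throws IOException {
        return Math.max(raf.length(), bufStart + bufUsed);
    }

    @Override public void close() throws IOException {
        flush();
        raf.close();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("write-back", ".bin");
        f.deleteOnExit();
        SimpleWriteBackFile w = new SimpleWriteBackFile(f);
        for (int i = 0; i < 2500; i++) w.write((byte) i, i);
        w.close();
        System.out.println("file length: " + f.length() + ", disk writes: " + w.flushes);
    }
}
```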

With this, buffered writing is basically implemented. Copying a 12 MB file byte by byte (this involves both reading and writing; with buffered reading already in place, BufferedRandomAccessFile is now tried for both reading and writing) gives:

Read                                   Write                                    Time (seconds)
RandomAccessFile                       RandomAccessFile                         95.848
BufferedInputStream + DataInputStream  BufferedOutputStream + DataOutputStream  2.935
BufferedRandomAccessFile               BufferedOutputStream + DataOutputStream  2.813
BufferedRandomAccessFile               BufferedRandomAccessFile                 2.453

The combined read/write speed now exceeds that of BufferedInput/OutputStream + DataInput/OutputStream.

3. Optimize BufferedRandomAccessFile.

Optimization principles:

- Optimize the most frequently executed statements first; that is where the payoff is greatest.
- When logic tests are nested, put the most likely case in the outermost test.
- Avoid unnecessary new operations.

Here is a typical example:

public void seek(long pos) throws IOException {
    ...
    this.bufstartpos = pos / this.bufsize * this.bufsize;
    // bufsize is a power of two; bufbitlen is its bit length, e.g. bufsize = 1024 gives bufbitlen = 10.
    ...
}
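The statement in question rounds pos down to the start of the buffer-sized block that contains it. As a small sketch (the class and method names are hypothetical, and it assumes a non-negative position and a power-of-two buffer size), the same value can be computed by division and multiplication, by two shifts, or by a single mask:

```java
class BlockAlign {
    static final int BUF_SIZE = 1024;                      // must be a power of two
    static final int BUF_BIT_LEN = 10;                     // log2(BUF_SIZE)
    static final long BUF_MASK = ~((long) BUF_SIZE - 1);   // clears the low BUF_BIT_LEN bits

    // Integer division discards the offset within the block.
    static long byDivision(long pos) {
        return pos / BUF_SIZE * BUF_SIZE;
    }

    // Shifting right then left zeroes the low BUF_BIT_LEN bits.
    static long byShift(long pos) {
        return (pos >> BUF_BIT_LEN) << BUF_BIT_LEN;
    }

    // A single AND with a precomputed mask does the same.
    static long byMask(long pos) {
        return pos & BUF_MASK;
    }

    public static void main(String[] args) {
        long[] samples = {0, 1, 1023, 1024, 1025, 5000, 12L * 1024 * 1024 - 1};
        for (long pos : samples) {
            System.out.println(pos + " -> " + byDivision(pos) + ", " + byShift(pos) + ", " + byMask(pos));
        }
    }
}
```

On current JVMs the just-in-time compiler may well reduce all three forms to similar machine code, so any speed difference should be measured rather than assumed.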

The seek function is used by every other function, so it is called very frequently. The statement above uses pos and bufsize to determine which block of the file buf[] should map, and doing this with "*" and "/" is clearly not the best way.

Optimization 1: this.bufstartpos = (pos >> bufbitlen) << bufbitlen;
Optimization 2: this.bufstartpos = pos & bufmask; // this.bufmask = ~((long) this.bufsize - 1);

Both are more efficient than the original, but the second is clearly better: the first needs two shift operations, while the second needs only a single logical AND (bufmask can be computed in advance).

With this optimization in place, copy a 12 MB file byte by byte once more (this involves both reading and writing; the optimized BufferedRandomAccessFile is used for both):

Read                                   Write                                    Time (seconds)
RandomAccessFile                       RandomAccessFile                         95.848
BufferedInputStream + DataInputStream  BufferedOutputStream + DataOutputStream  2.935
BufferedRandomAccessFile               BufferedOutputStream + DataOutputStream  2.813
BufferedRandomAccessFile               BufferedRandomAccessFile                 2.453
BufferedRandomAccessFile (optimized)   BufferedRandomAccessFile (optimized)     2.197

The gain from this optimization is not dramatic, but the result is still somewhat faster than before; the effect may be more noticeable on older machines.

The comparisons above use sequential access. Even with random access, in most cases more than one byte is accessed around each position, so the buffering mechanism remains effective. An ordinary sequential-access class, on the other hand, cannot easily provide random access at all.

4. Areas that need completing

Provide a file append function:

public boolean append(byte bw) throws IOException {
    return this.write(bw, this.fileendpos + 1);
}

Provide a function for modifying the byte at the current file position:

public boolean write(byte bw) throws IOException {
    return this.write(bw, this.curpos);
}

Return the file length (because reads and writes go through buf[], this differs from the original RandomAccessFile class):

public long length() throws IOException {
    return this.max(this.fileendpos + 1, this.initfilelen);
}

Return the current file pointer (because reads and writes go through buf[], this differs from the original RandomAccessFile class):

public long getFilePointer() throws IOException {
    return this.curpos;
}

Provide buffered writing of multiple bytes at the current position:

public void write(byte b[], int off, int len) throws IOException {
    long writeendpos = this.curpos + len - 1;
    if (writeendpos <= this.bufendpos) { // b[] fits in the current buf
        System.arraycopy(b, off, this.buf, (int) (this.curpos - this.bufstartpos), len);
        this.bufdirty = true;
        this.bufusedsize = (int) (writeendpos - this.bufstartpos + 1);
    } else { // b[] does not fit in the current buf
        super.seek(this.curpos);
        super.write(b, off, len);
    }
    if (writeendpos > this.fileendpos)
        this.fileendpos = writeendpos;
    this.seek(writeendpos + 1);
}

public void write(byte b[]) throws IOException {
    this.write(b, 0, b.length);
}

Provide buffered reading of multiple bytes at the current position:

public int read(byte b[], int off, int len) throws IOException {
    long readendpos = this.curpos + len - 1;
    if (readendpos <= this.bufendpos && readendpos <= this.fileendpos) {
        // read from buf
        System.arraycopy(this.buf, (int) (this.curpos - this.bufstartpos), b, off, len);
    } else { // the requested range extends beyond buf[]
        if (readendpos > this.fileendpos) { // only part of the range lies within the file
            len = (int) (this.length() - this.curpos + 1);
        }
        super.seek(this.curpos);
        len = super.read(b, off, len);
        readendpos = this.curpos + len - 1;
    }
    this.seek(readendpos + 1);
    return len;
}

public int read(byte b[]) throws IOException {
    return this.read(b, 0, b.length);
}

public void setLength(long newLength) throws IOException {
    if (newLength > 0) {
        this.fileendpos = newLength - 1;
    } else {
        this.fileendpos = 0;
    }
    super.setLength(newLength);
}

public void close() throws IOException {
    this.flushbuf(); // make sure buffered changes reach the disk before closing
    super.close();
}
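The multi-byte methods are intended for chunked loops of the following kind. This sketch uses the standard `RandomAccessFile` on both sides, because the article's class is not reproduced in full here; `ChunkCopy` is a hypothetical name. Since BufferedRandomAccessFile overrides the same `read(byte[])` and `write(byte[], int, int)` signatures, substituting it for RandomAccessFile in this loop is assumed to require no other change.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;

class ChunkCopy {
    // Copies src to dst in chunks of chunkSize bytes and returns the number of bytes copied.
    static long copy(File src, File dst, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        try (RandomAccessFile in = new RandomAccessFile(src, "r");
             RandomAccessFile out = new RandomAccessFile(dst, "rw")) {
            out.setLength(0); // start from an empty destination
            int n;
            // read(byte[]) returns the number of bytes read, or -1 at end of file.
            while ((n = in.read(chunk)) > 0) {
                out.write(chunk, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        File src = File.createTempFile("chunk-src", ".bin");
        File dst = File.createTempFile("chunk-dst", ".bin");
        src.deleteOnExit();
        dst.deleteOnExit();
        Files.write(src.toPath(), new byte[10_000]);
        System.out.println("bytes copied: " + copy(src, dst, 1024));
    }
}
```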

The completion work is now basically done. Trying the new multi-byte read/write functions, copy a 12 MB file by reading and writing 1024 bytes at a time (this involves both reading and writing; the completed BufferedRandomAccessFile is used for both):

Read                                   Write                                    Time (seconds)
RandomAccessFile                       RandomAccessFile                         95.848
BufferedInputStream + DataInputStream  BufferedOutputStream + DataOutputStream  2.935
BufferedRandomAccessFile               BufferedOutputStream + DataOutputStream  2.813
BufferedRandomAccessFile               BufferedRandomAccessFile                 2.453
BufferedRandomAccessFile (optimized)   BufferedRandomAccessFile (optimized)     2.197
BufferedRandomAccessFile (completed)   BufferedRandomAccessFile (completed)     0.401

5. How does it compare with the new JDK 1.4 combination MappedByteBuffer + RandomAccessFile?

JDK 1.4 introduces the NIO classes. Among them, MappedByteBuffer provides a mapped buffer and can also be used to map a randomly accessed file, so the Java designers evidently saw the problem with RandomAccessFile and improved on it. How is a file copied with MappedByteBuffer + RandomAccessFile? The main part of the test program follows:

RandomAccessFile rafi = new RandomAccessFile(srcFile, "r");
RandomAccessFile rafo = new RandomAccessFile(desFile, "rw");
FileChannel fci = rafi.getChannel();
FileChannel fco = rafo.getChannel();
long size = fci.size();
MappedByteBuffer mbbi = fci.map(FileChannel.MapMode.READ_ONLY, 0, size);
MappedByteBuffer mbbo = fco.map(FileChannel.MapMode.READ_WRITE, 0, size);
long start = System.currentTimeMillis();
for (int i = 0; i < size; i++) {
    byte b = mbbi.get(i);
    mbbo.put(i, b);
}
fci.close();
fco.close();
rafi.close();
rafo.close();
System.out.println("Spend: " + (double) (System.currentTimeMillis() - start) / 1000 + "s");

Trying the JDK 1.4 mapped-buffer read/write functions, copy a 12 MB file byte by byte (this involves both reading and writing):

Read                                   Write                                    Time (seconds)
RandomAccessFile                       RandomAccessFile                         95.848
BufferedInputStream + DataInputStream  BufferedOutputStream + DataOutputStream  2.935
BufferedRandomAccessFile               BufferedOutputStream + DataOutputStream  2.813
BufferedRandomAccessFile               BufferedRandomAccessFile                 2.453
BufferedRandomAccessFile (optimized)   BufferedRandomAccessFile (optimized)     2.197
BufferedRandomAccessFile (completed)   BufferedRandomAccessFile (completed)     0.401
MappedByteBuffer + RandomAccessFile    MappedByteBuffer + RandomAccessFile      1.209

That is quite good; JDK 1.4 appears to be a big step forward from 1.3. If you develop with version 1.4 in the future and need random file access, MappedByteBuffer + RandomAccessFile is the recommended approach. However, programs written for JDK 1.3 and earlier still make up the vast majority. If your Java program uses the RandomAccessFile class for random file access and you are worried that its poor performance will draw user complaints, try the BufferedRandomAccessFile class presented here. There is no need to rewrite anything: just import the class and change every RandomAccessFile to BufferedRandomAccessFile, and your program's performance should improve greatly. It is that simple.

6. Future considerations

Readers can build on this to add multi-page caching and a cache eviction mechanism, to cope with applications where random access is intensive.

Related resources: Sun JDK 1.3/1.4 src (source code).

About the author

Cui Zhixiang is a product manager at Shanghai Tita Network Information Co., Ltd., working on software product development. He can be reached by e-mail at bladeinco@citiz.net.
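As one possible starting point for the multi-page cache mentioned under "6. Future considerations", the sketch below keeps several 1024-byte pages in memory and evicts the least recently used one when the limit is exceeded. It is an illustration only, not the author's design: `PagedReadFile` is a hypothetical, read-only class, the LRU policy is an assumption (the article does not name an eviction policy), and the eviction relies on the access-order mode of the standard `java.util.LinkedHashMap`.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.util.LinkedHashMap;
import java.util.Map;

class PagedReadFile implements AutoCloseable {
    private static final int PAGE_SIZE = 1024;                  // power of two
    private static final long PAGE_MASK = ~((long) PAGE_SIZE - 1);

    private final RandomAccessFile raf;
    private final long length;             // file length, captured once at open time
    private final Map<Long, byte[]> pages; // page start offset -> page contents
    int pageLoads = 0;                     // how many pages were read from the file

    PagedReadFile(File f, final int maxPages) throws IOException {
        raf = new RandomAccessFile(f, "r");
        length = raf.length();
        // accessOrder = true keeps entries ordered from least to most recently used.
        pages = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > maxPages; // evict the least recently used page
            }
        };
    }

    // Returns the byte at offset pos as 0..255, or -1 if pos is outside the file.
    int read(long pos) throws IOException {
        if (pos < 0 || pos >= length) return -1;
        long start = pos & PAGE_MASK;
        byte[] page = pages.get(start);
        if (page == null) {
            page = load(start);
            pages.put(start, page);
        }
        return page[(int) (pos - start)] & 0xFF;
    }

    // Reads one page (possibly partial at end of file) starting at offset start.
    private byte[] load(long start) throws IOException {
        byte[] page = new byte[PAGE_SIZE];
        raf.seek(start);
        int total = 0;
        while (total < PAGE_SIZE) {
            int n = raf.read(page, total, PAGE_SIZE - total);
            if (n < 0) break;
            total += n;
        }
        pageLoads++;
        return page;
    }

    @Override public void close() throws IOException {
        raf.close();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("paged", ".bin");
        f.deleteOnExit();
        Files.write(f.toPath(), new byte[4096]);
        try (PagedReadFile p = new PagedReadFile(f, 2)) {
            long[] trace = {0, 1024, 0, 2048, 0, 1024};
            for (long pos : trace) p.read(pos);
            System.out.println("page loads: " + p.pageLoads);
        }
    }
}
```

A writable version would additionally need a dirty flag per page and a write-back step on eviction, along the lines of flushbuf() in the article.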

When reposting, please credit the original address: https://www.9cbs.com/read-87083.html
