MSDN Online - Working with C#: Open the Box! Quick!
Please visit the MSDN source code center to download the source code (in English) for this column.

Last month, we covered what boxing and unboxing are and when they happen. This month, we will look at the performance impact of boxing and how to keep it to a minimum.

Boxing and Performance

Boxing exists so that the object model in C# can be very simple. However, using boxed value types costs some performance. In most cases, the simpler object model is more important. This is true of software in general: the time spent developing and maintaining software is usually the most important thing to optimize, but sometimes it is the performance of the program itself that has to be maximized.

The best solution would probably be a generic ArrayList. This would let us declare an ArrayList of a specific type.

The word-counting loop used as this month's example looks like this:
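// Body of the enclosing loop over the lines of the file (its header is not shown).
{
    foreach (string word in regexSplit.Split(line.ToLower()))
    {
        int count = 0;
        object value = wordTable[word];   // look up the current count
        if (value != null)
            count = (int) value;          // unbox the stored count
        wordTable[word] = count + 1;      // box the new count and store it
    }
}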
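For readers who want to run the loop, here is a minimal self-contained sketch of the boxed version. The class name `BoxedWordCount`, the `\W+` splitting pattern, and the in-memory input lines are assumptions made for illustration; the column's own driver code reads a file and may split words differently.

```csharp
using System;
using System.Collections;
using System.Text.RegularExpressions;

public static class BoxedWordCount
{
    // Assumption: words are separated by runs of non-word characters.
    static readonly Regex regexSplit = new Regex(@"\W+");

    public static Hashtable Count(string[] lines)
    {
        Hashtable wordTable = new Hashtable();
        foreach (string line in lines)
        {
            foreach (string word in regexSplit.Split(line.ToLower()))
            {
                if (word.Length == 0)
                    continue;                     // Split can yield empty strings at the edges
                int count = 0;
                object value = wordTable[word];   // first lookup, hashes the key
                if (value != null)
                    count = (int) value;          // unbox the stored count
                wordTable[word] = count + 1;      // box a new value, hashes the key again
            }
        }
        return wordTable;
    }

    public static void Main()
    {
        Hashtable table = Count(new string[] { "The cat saw the dog.", "The dog ran." });
        Console.WriteLine(table["the"]);   // prints 3
        Console.WriteLine(table["dog"]);   // prints 2
    }
}
```

Every increment of an existing word goes through one unbox and one box, which is the cost the rest of the column tries to remove.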
In the inner loop, we fetch the current count for a word from the hashtable. If the value is not null, we unbox it to an int. The incremented value is then stored back in the hashtable. This code is easy to write, but if a word already exists, it carries a considerable performance cost. Just to add one to a value, we not only have to unbox it and box it again, but also compute the hash code of the string twice.

Despite these costs, the program still performs quite well. To get some concrete performance numbers, I needed some suitable text files. I first downloaded "On Basilisk Station" by David Weber, which contains about 160,000 words. This file turned out to be too small for a thorough test: a version of the program written in Perl processed it in less than a second. I then downloaded "War and Peace" from Project Gutenberg (in English), which contains approximately 600,000 words. That was a little better. The Perl program took about 4 seconds to count the words in "War and Peace". The C# program using boxed ints took approximately 10 seconds. Processing 60,000 words per second is very good, but I was curious how much faster it could go.

Setting the Baseline

Before we start trying different approaches, we need to know the baseline time for this C# program. In other words, how long does it take to read all the lines of the file and split them into words, without counting the words? This matters because only then can we compare just the time spent in the counting code. If we write code that does exactly that, we find that everything other than the counting takes about 7 seconds. This means the word counting itself takes about 3 seconds.

Making It Go Faster

Our goal is to eliminate the unboxing and reboxing of the integer.
In other words, we want to be able to get at the integer and increment the value in place in the hashtable, without reboxing it. An obvious way to do this is to use a reference type rather than a value type. We can wrap our integer in a class. This class is very simple:

class IntHolderClass
{
    int count;
    public IntHolderClass()
    {
        count = 1;
    }
    public int Count
    {
        get
        {
            return(count);
        }
        set
        {
            count = value;
        }
    }
    public override string ToString()
    {
        return(count.ToString());
    }
}
When a new instance of IntHolderClass is created, the count is set to 1. The count can then be incremented through the Count property. The main loop is modified as follows:

foreach (string word in regexSplit.Split(line.ToLower()))
{
    wordCount++;
    IntHolderClass value = (IntHolderClass) wordTable[word];
    if (value == null)
    {
        wordTable[word] = new IntHolderClass();
    }
    else
    {
        value.Count++;
    }
}
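To make the difference between the two loops concrete, the following sketch checks whether the hashtable still holds the same object after an increment. It is only an illustration: `Holder` is a hypothetical, stripped-down stand-in for the wrapper class, and `BoxIdentityDemo` and its method names are not part of the column's sample code.

```csharp
using System;
using System.Collections;

public class Holder            // hypothetical stand-in for the wrapper class
{
    public int Count = 1;
}

public static class BoxIdentityDemo
{
    // Boxed int: the increment replaces the stored object with a new box.
    public static bool BoxedIntKeepsObject()
    {
        Hashtable table = new Hashtable();
        table["word"] = 1;                    // boxes the value 1
        object before = table["word"];
        int count = (int) before;             // unboxing copies the value out
        table["word"] = count + 1;            // boxing allocates a new object
        return object.ReferenceEquals(before, table["word"]);
    }

    // Wrapper class: the increment changes the stored object in place.
    public static bool HolderKeepsObject()
    {
        Hashtable table = new Hashtable();
        table["word"] = new Holder();
        object before = table["word"];
        ((Holder) table["word"]).Count++;     // no new allocation, no re-insert
        return object.ReferenceEquals(before, table["word"]);
    }

    public static void Main()
    {
        Console.WriteLine(BoxedIntKeepsObject());   // prints False
        Console.WriteLine(HolderKeepsObject());     // prints True
    }
}
```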
In the modified main loop, if the word does not exist yet, we create a new instance of the wrapper class and put it in the hashtable. If the word already exists, we simply increment its count.

When we run this version, the counting takes less than a second, about 30% of the time taken by the boxing version. I was a little surprised at first, because on the surface using a class should add more overhead. On reflection it makes sense: creating an instance of the wrapper class costs about the same as creating a box, but with the boxed int we create a box for every word (600,000 in total) rather than one for each unique word (approximately 19,000).

How much this technique improves performance depends on how many operations we perform on the value while it is stored in the collection. If we simply find the object and take it out, there is no improvement at all. For example, if I use an ArrayList to store integer values for later processing, this technique will not help performance.

When you write code, it is best to write it the simplest way first. After that, if more speed is required, other approaches can be considered.

Another technique takes a similar approach to get the same result. A value type can implement an interface, but because an interface is a reference type, an interface reference can only refer to a boxed value type. We can define an interface with an Increment() member and have the value type implement it. With this approach, we can get the interface directly from the boxed value type and then call Increment(), with no unboxing required. The interface and the value type look like this:

interface IIncrement
{
    void Increment();
}
struct IntHolderStruct: IIncrement
{
    int value;
    public IntHolderStruct(int value)
    {
        this.value = value;
    }
    public int Value
    {
        get
        {
            return(value);
        }
    }
    public void Increment()
    {
        value++;
    }
    public override string ToString()
    {
        return(value.ToString());
    }
}
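The column does not list the main loop for the struct version, so the following is only a sketch of how such a loop might look. It repeats the interface and struct so that it compiles on its own; `StructWordCount`, its `Count` method, and the pre-split word array are assumptions made for illustration.

```csharp
using System;
using System.Collections;

public interface IIncrement
{
    void Increment();
}

public struct IntHolderStruct : IIncrement
{
    int value;
    public IntHolderStruct(int value) { this.value = value; }
    public int Value { get { return value; } }
    public void Increment() { value++; }
    public override string ToString() { return value.ToString(); }
}

public static class StructWordCount
{
    // Assumption: the input has already been split into lowercase words.
    public static Hashtable Count(string[] words)
    {
        Hashtable wordTable = new Hashtable();
        foreach (string word in words)
        {
            // Casting the boxed struct to the interface does not unbox it.
            IIncrement value = (IIncrement) wordTable[word];
            if (value == null)
                wordTable[word] = new IntHolderStruct(1);   // boxed once per unique word
            else
                value.Increment();                          // updates the boxed copy in place
        }
        return wordTable;
    }

    public static void Main()
    {
        Hashtable table = Count(new string[] { "a", "b", "a", "a" });
        Console.WriteLine(table["a"]);   // prints 3
        Console.WriteLine(table["b"]);   // prints 1
    }
}
```

Note that casting the stored object back to `IntHolderStruct` would unbox it and produce a copy, so calling `Increment()` on that copy would leave the count in the hashtable unchanged. Going through the interface is what keeps the update in place.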
I would like to say I came up with this on my own, but in fact it was written by an attendee I met at a recent conference. The main loop for this version is very similar to the one for the wrapper-class version. This version takes approximately 32% of the time taken by the boxed-int version. In other words, it is only slightly slower than the wrapper-class version.

All Results

The following is a summary of all the results.

Table 1: Results for "On Basilisk Station"

Implementation                 Time (seconds)   Ratio to boxed int
Boxed int                      0.64             1.00
Wrapper class                  0.28             0.43
Boxed struct with interface    0.29             0.45

Table 2: Results for "War and Peace"

Implementation                 Time (seconds)   Ratio to boxed int
Boxed int                      3.01             1.00
Wrapper class                  0.92             0.31
Boxed struct with interface    0.97             0.32

The various implementations and the driver code can be found in the sample code. To keep the archive from getting too large, I have left the text files out of it. I have also included a Perl file for reference. To run the Perl program, you need to download ActivePerl from http://www.activestate.com (it is free).

Summary

I hope you enjoyed this look at boxing. In general, although boxing costs some performance, this does not matter much, because the simplification it brings is more important. Sometimes, though, it may be necessary to use a wrapper class to reduce the performance cost of boxing.

Website Grab Bag

There are a lot of sites in the grab bag, so I used my secret random number generator to pick out five. They are:

csharpfree.com
csharpindex.com
C# Corner
the .NET Enhance Project

The technical lead needs to point out that these sites come with no guarantee. What I can guarantee is that when you click on the URLs above, a web page will be displayed.
Eric Gunnerson is a QA lead on the C# compiler team, a member of the C# design team, and the author of the book "A Programmer's Introduction to C#". He has been programming long enough that he knows what an 8-inch disk is, and he can even load one with one hand.