Charming Python: Make Python run as fast as C with Psyco



Contents:

How Psyco works
Using Psyco
Psyco's performance
Where is Psyco going?
Resources
About the author


Using Psyco, the Python specializing compiler
Dr. David Mertz (mertz@gnosis.cx), Gnosis Software, Inc.
October 2002

Python's design resembles Java's in many ways. Both use a virtual machine that interprets specialized pseudo-compiled bytecode. One area where JVMs are more advanced than Python is in optimizing the execution of that bytecode. Psyco, a Python specializing compiler, helps even the score. Psyco is currently an external module, but it might be included in Python itself someday. With only a small amount of extra programming, Psyco can often increase the speed of Python code by orders of magnitude. In this article, David Mertz looks at what Psyco is and tests it on some applications.

Python is usually fast enough for what you want it to do. Most of the worries that novice programmers voice about the execution speed of an interpreted, byte-compiled language like Python are simply naive. On current hardware, most unoptimized Python programs run as fast as they need to, and there is little point in spending extra programming effort to make them faster. In this article, therefore, I am interested only in the exceptions.

Once in a while, a Python program (or a program written in any other language) runs impractically slowly. How much improvement is worthwhile depends on the purpose: shaving a few milliseconds off a task is rarely compelling, but speeding up tasks that run for minutes, hours, days, or even longer is usually worth the effort. Note also that not every slow task is CPU-bound. If a database query takes hours to complete, for example, it hardly matters whether processing the result set takes one minute or two. This article does not address I/O issues.

There are several ways to speed up a Python program. The first technique that every programmer should think of is improving the algorithms and data structures used. Micro-optimizing the steps of an inefficient algorithm is a fool's errand. For example, if the complexity of the current technique is O(n**2), making its steps 10 times faster helps far less than finding an O(n) alternative. The point holds even for something as extreme as a rewrite in assembly language: the right algorithm in Python will often run faster than the wrong algorithm in hand-tuned assembly.

The second technique to consider is profiling your Python application, with an eye toward rewriting key portions as C extension modules. With an extension wrapper such as SWIG (see Resources), you can create a C extension that executes the most time-consuming parts of your program as C code.
Extending Python this way is relatively simple, but it takes some time to learn (and requires knowledge of C). You will often find that most of the running time of a Python application is spent in just a handful of functions, so such an extension can pay off considerably.

The third technique builds on the second. Greg Ewing has created a language called Pyrex, which mixes Python and C. To use Pyrex, you write functions in a Python-like language that adds type declarations to selected variables. Pyrex (the tool) processes ".pyx" files into ".c" extension files. Once compiled with a C compiler, these Pyrex (the language) modules can be imported and used in regular Python applications. Because Pyrex uses almost the same syntax as Python itself (including loop, branch, and exception statements, assignment forms, and so on), a Pyrex programmer does not need to learn C to write extensions. Moreover, Pyrex allows C-level variables and Python-level variables (objects) to be mixed more seamlessly than an extension written directly in C does.

The last technique is the subject of this article. The extension module Psyco plugs into the internals of the Python interpreter and selectively replaces portions of interpreted Python bytecode with optimized machine code. Unlike the other techniques described, Psyco operates strictly at Python runtime. That is, Python source code is compiled to bytecode by the python command in exactly the same way as before (apart from a few import statements and function calls added to invoke Psyco).
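Because Psyco is an external, architecture-specific module, those few added lines can be written so that a program still runs where Psyco is missing. The following is a minimal sketch rather than anything from the original article: the helper name maybe_bind and the sample function sum_of_squares are hypothetical, and the only Psyco call used is psyco.bind(), which this article describes.

```python
# Optional import: Psyco is an external module, so fall back to plain
# Python when it is not installed on this interpreter.
try:
    import psyco
except ImportError:
    psyco = None


def maybe_bind(func):
    """Ask Psyco to specialize func if Psyco is available (hypothetical helper)."""
    if psyco is not None:
        psyco.bind(func)
    return func


def sum_of_squares(n):
    """A loop-heavy integer function, the kind of code Psyco targets."""
    total = 0
    for i in range(n):
        total += i * i
    return total


maybe_bind(sum_of_squares)
print(sum_of_squares(10))  # 285, with or without Psyco
```

The function's result is the same either way; only its running time is expected to differ when Psyco is present.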

While the Python interpreter runs the application, Psyco checks from time to time whether it can substitute specialized machine code for regular Python bytecode operations. This specialized compilation is very similar to what Java just-in-time compilers do (broadly speaking, at least), and it is architecture-specific. So far, Psyco is available only for the i386 CPU architecture. The beauty of Psyco is that you can keep using the Python code you have been writing all along (exactly the same code!) and simply let it run faster.

How Psyco works

To understand Psyco fully, you probably need a good grasp of both the Python interpreter's eval_frame() function and i386 assembly language. Unfortunately, I am an expert in neither, but I think I can give an outline of Psyco.

In regular Python, the eval_frame() function is the inner loop of the Python interpreter. Essentially, eval_frame() looks at the current bytecode in an execution context and switches control out to a function suited to implementing that bytecode. What that support function does in detail usually depends on the state of various Python objects held in memory. Put simply, adding the Python objects "2" and "3" produces a different result from adding the objects "5" and "6", but both operations are dispatched in a similar way.

Psyco replaces the eval_frame() function with a compound evaluation unit. There are several ways in which Psyco improves on what Python does. First, Psyco compiles operations into somewhat optimized machine code; in itself this yields only a slight improvement, because the machine code has to accomplish the same thing as Python's dispatched functions. Second, what is "specialized" in Psyco compilation is more than the choice of Python bytecodes: Psyco also specializes over variable values that are known in the execution context.
For example, in code like the following, the value of the variable x is known for the duration of the loop:

    x = 5
    L = []
    for i in range(1000):
        L.append(x * i)

An optimized version of this code does not need to multiply each i by "the contents of the variable/object x"; it is cheaper simply to multiply each i by 5, which saves a lookup and dereference step.

Besides generating i386-specific code for small operations, Psyco caches the compiled machine code for later reuse. If Psyco recognizes that a particular operation is the same as one performed (and "specialized") earlier, it can rely on the cached code rather than recompiling the segment. This saves some more time.

The real savings in Psyco, however, come from its division of operations into three levels. For Psyco there are "run-time", "compile-time", and "virtual-time" variables, and Psyco promotes and demotes variables between levels as needed. Run-time variables are simply the raw bytecodes and object structures that the regular Python interpreter handles. Compile-time variables are represented in machine registers and directly accessed memory locations once Psyco has compiled the operations into machine code.

The most interesting level is the virtual-time variable. Internally, a Python variable is a complete structure with many members, even when the object represents only an integer. Psyco's virtual-time variables represent Python objects that could be built if needed, but whose details are left out until then. For example, consider the following assignment:

    x = 15 * (14 + (13 - (12 / 11)))

Standard Python builds and destroys a number of objects to compute this value. A complete integer object is built to hold the value of (12/11); then a value is pulled out of the temporary object's structure and used to compute a new temporary object (13 - PyInt). Psyco skips these objects and computes only the values, because it knows that an object can be created from a value "if needed".

Using Psyco

Explaining Psyco is relatively difficult, but using it is very easy. Basically, all you have to do is tell the Psyco module which functions and methods to "specialize". No changes are needed in any of your Python functions or classes themselves.

There are a few ways to specify what Psyco should do. The "shotgun" approach enables Psyco's just-in-time operation everywhere. To do that, put the following lines at the top of your module:

    import psyco ; psyco.jit()
    from psyco.classes import *

The first line tells Psyco to "work its magic" on all global functions. The second line (in Python 2.2 and above) tells Psyco to do the same for class methods. To control Psyco's behavior more precisely, you can use the following commands:

    psyco.bind(somefunc)          # or method, class
    newname = psyco.proxy(func)

The second form leaves func as a standard Python function but optimizes calls that go through newname. In almost every case other than testing and debugging, psyco.bind() is the form you will use.

Psyco's performance

Magical as Psyco is, using it still takes a little thought and testing. The main thing to understand is that Psyco is useful for blocks that loop many times, and that it knows how to optimize operations involving integers and floating-point numbers. For non-looping functions and for operations on other types of objects, Psyco mostly just adds the overhead of its analysis and internal compilation. Moreover, in applications with a large number of functions and classes, enabling Psyco application-wide adds a heavy burden of machine-code compilation and of memory used to cache the result. It is far better to bind selectively those functions that stand to gain the most from Psyco's optimizations.

I began my testing in a completely naive way. I simply considered which application I had run recently that I would not mind speeding up. The first example that came to mind was a text-manipulation program I use to convert drafts of my book (Text Processing in Python) into LaTeX format. The application uses some string methods, some regular expressions, and some program logic driven mainly by regular-expression and string matches. It is actually a terrible candidate for testing Psyco, but I use it, so I started there.

On the first pass, all I did was add psyco.jit() to the top of the script. That takes no effort at all. Unfortunately, the results were (as expected) quite disappointing. The original script took 8.5 seconds to run; after the Psyco "speedup" it took about 12 seconds. Terrible! I guessed that the startup overhead of just-in-time compilation was swamping the running time. So next I tried processing a much larger input file (made up of multiple copies of the original input).
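A comparison of this kind can be reproduced with a small timing helper. The sketch below is only an illustration under stated assumptions, not the author's script: time_call and convert are hypothetical names, and convert is merely a stand-in for the real text-processing program (regular expressions and string work, very little arithmetic).

```python
import re
import time


def time_call(func, *args, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` calls."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def convert(text):
    """Hypothetical stand-in: rewrite *word* markup as a LaTeX \\emph{word}."""
    return re.sub(r"\*(\w+)\*", r"\\emph{\1}", text).strip()


sample = "Some *emphasized* words in a line of draft text.\n" * 2000
small = time_call(convert, sample)
large = time_call(convert, sample * 10)  # larger input made of repeated copies
print("small input: %.4f s, large input: %.4f s" % (small, large))
```

Timing the same function on both a small and a repeated input, with and without a Psyco call added, is one way to tell fixed startup overhead apart from any per-iteration gain.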
The larger input brought only a very limited success: running time fell from about 120 seconds to about 110 seconds. The speedup was consistent across several runs, but it was not significant.

For the second pass on the text-processing candidate, I added only the line psyco.bind(main) instead of a blanket psyco.jit() call, because the main() function does loop many times (though it makes only minimal use of integer arithmetic). The results were nominally better than before. This approach shaved a little off the normal running time and a few seconds off the run with the larger input. Still nothing eye-catching (but no harm done, either).

For a more suitable test of Psyco, I dug up some neural network code that I wrote about in an earlier article (see Resources). This "code_recognizer" application can be trained to recognize the likely distribution of ASCII values in files written in different programming languages. Something like it might be useful for guessing file types (of lost network packets, say), but the code is completely generic about what it is trained on; it could just as easily learn to recognize faces, sounds, or tidal patterns. In any case, code_recognizer is based on the Python library bpnn, which is also included (in modified form) as a test case in the Psyco 0.4 distribution. For this article, the important thing to know about code_recognizer is that it does a great deal of looping floating-point arithmetic and takes a long time to run. Here we have a good candidate for testing Psyco.

After playing with it for a while, I established several things about how to use Psyco.

For this application, which has only a few classes and functions, it makes little difference whether you use just-in-time or targeted binding. The best result, though, is still a few percent better with selective binding of the most optimizable classes. More important is understanding the scope of Psyco binding. The code_recognizer.py script includes lines like these:

    from bpnn import NN
    class NN2(NN):
        # customized output methods, math core inherited

That is, from Psyco's point of view the interesting material is in the class bpnn.NN. Adding psyco.jit() or psyco.bind(NN2) to the code_recognizer.py script has no effect. To get the optimization you want from Psyco, you need to add psyco.bind(NN) to code_recognizer.py or add psyco.jit() to bpnn.py. Contrary to what you might assume, just-in-time optimization does not happen when an instance is created or when a method runs, but in the scope where the class is defined. In addition, binding a derived class does not specialize the methods it inherits from elsewhere.

Once the appropriate Psyco binding is worked out, the speedup is quite impressive. With the same test case and training regime as in the referenced article (500 training patterns, 1000 training iterations), neural network training time fell from about 2000 seconds to about 600 seconds, better than a 3x speedup. Reducing the number of iterations to as few as 10 gave a proportional speedup (though the resulting network's recognition was worthless), as did intermediate numbers of iterations.

I find it remarkable that two lines of new code bring the running time down from more than half an hour to about 10 minutes. This speedup probably still leaves the program slower than a similar application written in C, and it is certainly less than the 100x speedup that a few isolated Psyco test cases show.
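The point about binding scope follows from where Python stores methods. The sketch below uses stand-in classes, since the real bpnn.NN is not reproduced here; the method names update and report are hypothetical. It illustrates Python's class model only and does not show Psyco's internals.

```python
class NN:
    """Stand-in for bpnn.NN: holds the numeric core (hypothetical body)."""

    def update(self, inputs):
        return sum(x * 0.5 for x in inputs)


class NN2(NN):
    """Stand-in for the script's subclass: only an output method is new."""

    def report(self, inputs):
        return "activation=%.1f" % self.update(inputs)


# Methods defined directly in each class body.
print(sorted(n for n in vars(NN2) if not n.startswith("__")))  # ['report']
print(sorted(n for n in vars(NN) if not n.startswith("__")))   # ['update']

# In Python 3 the inherited method is the same function object as the
# one stored on the base class.
print(NN2.update is NN.update)  # True
print(NN2().report([1, 2, 3]))  # activation=3.0
```

Because the loop-heavy update method lives in the namespace of NN and not of NN2, a tool that works on the functions found in a class is pointed at the numeric core only when it is given the base class.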
The neural network application is nonetheless fairly "real-life", and its improvement is enough to matter in many settings.

Where is Psyco going?

Psyco currently does not perform any kind of internal statistics or profiling, and it does only minimal optimization of the machine code it generates. A later version might know how to target the Python operations that would actually benefit most, and how to discard cached machine code for sections that cannot be optimized. A future Psyco might also decide to perform more extensive (but more expensive) optimizations on heavily run operations. Such run-time analysis would be similar to what Sun's HotSpot technology does for Java. The fact that Java, unlike Python, has type declarations matters less than many people assume (earlier work on optimizing Self, Smalltalk, Lisp, and Scheme makes the same point).

Although I suspect it will never actually happen, it would be exciting to see Psyco-style technology integrated into a future version of Python itself. Adding a few lines of import and binding code is not much work, but having Python simply run much faster on its own would be more seamless still. We will see.

Resources

Find more information on the Psyco home page and project page at SourceForge.

The Simplified Wrapper and Interface Generator (SWIG) is a very widely used (perhaps the predominant) tool for writing C/C++ modules for Python and other "scripting" languages.

Greg Ewing created the Pyrex language, which is used for writing Python extension modules. The idea behind Pyrex is to define a language that looks very close to Python itself and allows Python and C data types to be mixed, but that is ultimately transformed and compiled into a Python C extension.

John Max Skaller's Vyper language was intended to be an enhanced Python implemented in OCaml. One hoped-for result of the project was compilation to the same machine code that OCaml generates, which is generally comparable in speed to C. Unfortunately, Vyper is a dead project, and a compiling version was never completed. Read David's interview with Skaller from the time when the project was under development (developerWorks, October 2000).

David co-authored An Introduction to Neural Networks with Andrew Blais (developerWorks, July 2001). In that article, they provide some code based on Neil Schemenauer's Python module bpnn. The current article uses that neural network code to demonstrate what Psyco can do.

The bpnn module is included in the current Psyco distribution as a test case. Here is its original module.

Find more Linux articles in the developerWorks Linux zone.

About the author

David Mertz's failures as a hunter, fisherman, and shepherd have led him to a life of critical criticism. Tomorrow he may try something else. You can reach him at mertz@gnosis.cx and read about his life at http://gnosis.cx/publish/. Suggestions and comments on past, present, or future columns are welcome.
