In this article we go over the Python file types .pyc, .pyo and .pyd, and how they're used to store compiled code that will be imported by other Python programs.
You might have worked with .py files writing Python code, but you want to know what these other file types do and where they come into use. To understand these, we will look at how Python transforms code you write into instructions that can be executed.
Bytecode and the Python Virtual Machine
Python ships with an interpreter that can be used interactively on the command line as a REPL (read-eval-print loop). Alternatively, you can invoke Python with scripts of Python code. In both cases, the interpreter parses your input and then compiles it into bytecode (lower-level instructions that are independent of the hardware), which is then executed by a "Pythonic representation" of the computer. This Pythonic representation is called the Python virtual machine.
However, it differs enough from other virtual machines like the Java virtual machine or the Erlang virtual machine that it deserves its own study. The virtual machine, in turn, interfaces with the operating system and actual hardware to execute native machine instructions.
The critical thing to keep in mind when you see .pyc and .pyo files is that they are created by the Python interpreter when it transforms source code into compiled bytecode; a .pyd file, as we will see, is a compiled library instead. Compilation of Python source into bytecode is a necessary intermediate step in the process of translating instructions from source code in human-readable language into machine instructions that your operating system can execute.
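The compile-then-execute step described above can be observed from within Python itself. The following is a minimal sketch using the standard compile() built-in and the dis module; the SOURCE string and variable names are made up for illustration, and the exact instruction names printed vary between Python versions.

```python
import dis

# A tiny piece of source code, invented for this illustration.
SOURCE = "total = 2 + 3\n"

# compile() parses the source and returns a code object holding bytecode.
code = compile(SOURCE, "<example>", "exec")

# co_code is the raw bytecode; dis renders it as readable instruction names.
raw_bytecode = code.co_code
instruction_names = [ins.opname for ins in dis.get_instructions(code)]

# Executing the code object is the job of the Python virtual machine.
namespace = {}
exec(code, namespace)

print(instruction_names)
print(namespace["total"])
```

The code object produced here is the same kind of object that a .pyc file stores on disk.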
Throughout this article we'll take a look at each file type in isolation, but first we'll provide a quick background on the Python virtual machine and Python bytecode.
The .pyc File Type
We consider first the .pyc file type. Files of type .pyc are automatically generated by the interpreter when you import a module, which speeds up future importing of that module. These files are therefore only created from a .py file if it is imported by another .py file or module.
Here is an example Python module which we want to import. This module calculates factorials.
# math_helpers.py
# a function that computes the nth factorial, e.g. factorial(2)
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

# a main function that uses our factorial function defined above
def main():
    print("I am the factorial helper")
    print("you can call factorial(number) where number is any integer")
    print("for example, calling factorial(5) gives the result:")
    print(factorial(5))

# this runs when the script is called from the command line
if __name__ == '__main__':
    main()
Now, when you just run this module from the command line, using python math_helpers.py, no .pyc files get created.
Let's now import this in another module, as shown below. We are importing the factorial function from the math_helpers.py file and using it to compute the factorial of 6.
# computations.py
# import from the math_helpers module
from math_helpers import factorial

# a function that makes use of the imported function
def main():
    print("Python can compute things easily from the REPL")
    print("for example, just write : 4 * 5")
    print("and you get: 20.")
    print("Computing things is easier when you use helpers")
    print("Here we use the factorial helper to find the factorial of 6")
    print(factorial(6))

# this runs when the script is called from the command line
if __name__ == '__main__':
    main()
We can run this script by invoking python computations.py at the terminal. Not only do we get the result of 6 factorial, i.e. 720, but we also notice that the interpreter automatically creates a math_helpers.pyc file (on Python 3, the file is written to a __pycache__ directory under a version-tagged name, such as math_helpers.cpython-311.pyc for CPython 3.11). This happens because the computations module imports the math_helpers module. To speed up the loading of the imported module in the future, the interpreter creates a bytecode file of the module.
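You do not have to wait for an import to get a bytecode file. As a rough sketch, the standard py_compile module can compile a source file on request and report where the result was written. The helper name compile_to_pyc and the temporary directory are assumptions made for this illustration, and the factorial body is condensed.

```python
import os
import py_compile
import tempfile


def compile_to_pyc(source_path):
    """Compile one source file and return the path of the bytecode file written."""
    return py_compile.compile(source_path, doraise=True)


with tempfile.TemporaryDirectory() as workdir:
    # Write a small stand-in for math_helpers.py into a scratch directory.
    source_path = os.path.join(workdir, "math_helpers.py")
    with open(source_path, "w") as handle:
        handle.write(
            "def factorial(n):\n"
            "    return 1 if n == 0 else n * factorial(n - 1)\n"
        )

    pyc_path = compile_to_pyc(source_path)
    pyc_exists = os.path.exists(pyc_path)
    print(pyc_path)
```

The printed path shows where your interpreter places bytecode files for that source file.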
When the source code file is updated, the .pyc file is updated as well. This happens whenever the update time for the source code differs from that of the bytecode file and ensures that the bytecode is up to date.
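The timestamp comparison can be inspected directly. The sketch below assumes the 16-byte .pyc header layout used by CPython 3.7 and later (magic number, flags, source modification time, source size); older versions use a shorter header, so treat this as an illustration, not a portable parser. The helper read_pyc_header is a hypothetical name.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile


def read_pyc_header(pyc_path):
    """Return (magic, flags, mtime, size) from a 16-byte .pyc header (CPython 3.7+)."""
    with open(pyc_path, "rb") as handle:
        return struct.unpack("<4sIII", handle.read(16))


with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, "math_helpers.py")
    with open(source_path, "w") as handle:
        handle.write("x = 1\n")

    # Request timestamp-based invalidation explicitly so the header stores an mtime.
    pyc_path = py_compile.compile(
        source_path,
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )

    magic, flags, recorded_mtime, recorded_size = read_pyc_header(pyc_path)
    source_stat = os.stat(source_path)

    # The bytecode counts as up to date when the recorded values match the source file.
    is_fresh = (
        magic == importlib.util.MAGIC_NUMBER
        and recorded_mtime == int(source_stat.st_mtime) & 0xFFFFFFFF
        and recorded_size == source_stat.st_size & 0xFFFFFFFF
    )
    print(is_fresh)
```

If the source file is edited afterwards, its modification time no longer matches the recorded value, and the interpreter recompiles on the next import.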
Note that using .pyc files only speeds up the loading of your program, not the actual execution of it. What this means is that you can improve startup time by writing your main program in a module that gets imported by another, smaller module. To get performance improvements more generally, however, you'll need to look into techniques like algorithm optimization and algorithmic analysis.
Because .pyc files are platform independent, they can be shared across machines of different architectures. However, if developers have different clock times on their systems, checking in the .pyc files into source control can create timestamps that are effectively in the future for others' time readings. As such, updates to source code no longer trigger changes in the bytecode. This can be a nasty bug to discover. The best way to avoid it is to add .pyc files to the ignore list in your version control system.
The .pyo File Type
The .pyo file type is also created by the interpreter when a module is imported. However, the .pyo file results from running the interpreter when optimization settings are enabled. As in the previous example, a module has to be imported for its bytecode file to be written. In the following code listing, we import a module named lambdas.py and make use of its g lambda.
# using_lambdas.py
# import the lambdas module
import lambdas

# a main function in which we compute the double of 7
def main():
    print(lambdas.g(7))

# this executes when the module is invoked as a script at the command line
if __name__ == '__main__':
    main()
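The lambdas.py module itself is not listed here. A minimal version consistent with the description, assuming g simply doubles its argument, might look like the following; this is a hypothetical reconstruction, not the original file.

```python
# lambdas.py (hypothetical reconstruction; the original listing is not shown)
# a lambda that doubles its argument
g = lambda x: x * 2
```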
Now we come to the critical part of this example. Instead of invoking Python normally as in the last example, we will make use of optimization here. Having the optimizer enabled creates smaller bytecode files than when not using the optimizer.
To run this example using the optimizer, invoke the command:
$ python -O using_lambdas.py
Not only do we get the correct result of doubling 7, i.e. 14, as output at the command line, but we also see that a new bytecode file is automatically created for us. This file is based on the importation of lambdas.py in the invocation of using_lambdas.py. Because we had the optimizer enabled, a .pyo bytecode file is created. In this case, it is named lambdas.pyo.
The optimizer, which doesn't do a whole lot, removes assert statements from your bytecode. The result won't be noticeable in most cases, but there may be times when you need it.
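The effect on assert statements can be reproduced without restarting the interpreter, because the compile() built-in accepts an optimize argument that mirrors the -O flag. This is a small sketch; the SOURCE snippet and the run_with_optimization helper are invented for the illustration.

```python
# A snippet whose assert always fails, invented for this illustration.
SOURCE = "assert False, 'stripped under -O'\nresult = 'reached'\n"


def run_with_optimization(level):
    """Compile SOURCE at the given optimization level and report what happened."""
    namespace = {}
    try:
        exec(compile(SOURCE, "<demo>", "exec", optimize=level), namespace)
    except AssertionError:
        return "assertion raised"
    return namespace["result"]


# Level 0 keeps asserts; level 1 corresponds to running with -O.
print(run_with_optimization(0))
print(run_with_optimization(1))
```

Because asserts vanish under optimization, they should not be relied on for checks that must always run, such as validating user input.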
Also note that, since a .pyo bytecode file is created, it substitutes for the .pyc file that would have been created without optimization. When the source code file is updated, the .pyo file is updated whenever the update time for the source code differs from that of the bytecode file.
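On Python 3.5 and later the .pyo extension is no longer written; optimized bytecode is stored in a .pyc file whose name carries an optimization tag instead (PEP 488). As a sketch, importlib.util.cache_from_source shows the names your interpreter would use for a source file called lambdas.py; the variable names here are only illustrative.

```python
import importlib.util

# An empty string asks for the name without any optimization tag.
default_name = importlib.util.cache_from_source("lambdas.py", optimization="")

# Level 1 corresponds to the -O flag.
optimized_name = importlib.util.cache_from_source("lambdas.py", optimization=1)

print(default_name)
print(optimized_name)
```

The second name ends in .opt-1.pyc, which is the modern counterpart of lambdas.pyo.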
The .pyd File Type
The .pyd file type, in contrast to the preceding two, is platform-specific to the Windows class of operating systems. It may thus be commonly encountered on personal and enterprise editions of Windows 10, 8, 7 and others.
In the Windows ecosystem, a .pyd file is a library file containing compiled code, typically a C or C++ extension module, which can be called and used by Python applications. In order to make this library available to Python programs, it is packaged as a dynamic link library.
Dynamic link libraries (DLLs) are Windows code libraries that are linked to calling programs at run time. The main advantage of linking to libraries at run time like the DLLs is that it facilitates code reuse, modular architectures and faster program startup. As a result, DLLs provide a lot of functionality around the Windows operating systems.
A .pyd file is a dynamic link library that contains a Python module, or set of modules, to be called by other Python code. To create a .pyd file, you need to build a module named, for example, example.pyd. In this module, you will need to define an initialization function named PyInit_example(). When programs use this library, they invoke import example, and the PyInit_example() function will run.
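To check which file extensions your own interpreter treats as compiled extension modules, you can look at importlib.machinery.EXTENSION_SUFFIXES. This is a small sketch; on Windows builds of CPython the list is expected to include ".pyd", while on Linux or macOS it usually lists ".so" variants instead.

```python
import importlib.machinery

# Suffixes that this interpreter will load as compiled extension modules.
extension_suffixes = list(importlib.machinery.EXTENSION_SUFFIXES)

# True on interpreters that recognize .pyd files (typically Windows builds).
recognizes_pyd = ".pyd" in extension_suffixes

print(extension_suffixes)
print(recognizes_pyd)
```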
For more information on creating your own Python .pyd files, check out this article.
Differences Between These File Types
While some similarities exist between these file types, there are also some big differences. For example, while the .pyc and .pyo files are similar in that they contain Python bytecode, they differ in that the .pyo files are more compact thanks to the optimizations made by the interpreter.
The third file type, the .pyd, differs from the previous two by being a dynamically-linked library to be used on the Windows operating system. The other two file types can be used on any operating system, not just Windows.
Each of these file types, however, involves code that is called and used by other Python programs.
Conclusion
In this article we described how each special file type, .pyc, .pyo, and .pyd, is used by the Python virtual machine for re-using code. Each file, as we saw, has its own special purposes and use-cases, whether it be to speed up module loading, speed up execution, or facilitate code re-use on certain operating systems.
(Source: stackoverflow.com)
--------------------------------------------------------------------------------------------------------
Is a PYD file the same as a DLL?
Yes, .pyd files are dll’s, but there are a few differences. If you have a DLL named spam.pyd, then it must have a function initspam(). You can then write Python “import spam”, and Python will search for spam.pyd (as well as spam.py, spam.pyc) and if it finds it, will attempt to call initspam() to initialize it.
Note that the search path for spam.pyd is PYTHONPATH, not the same as the path that Windows uses to search for spam.dll. Also, spam.pyd need not be present to run your program, whereas if you linked your program with a dll, the dll is required. Of course, spam.pyd is required if you want to say “import spam”. In a DLL, linkage is declared in the source code with __declspec(dllexport). In a .pyd, linkage is defined in a list of available functions.
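The idea that one import statement can match several file types can be sketched by listing the file names the import system would consider for a module. This is a deliberate simplification: the real import machinery also handles packages, __pycache__ directories and other finders. The candidate_files helper is a hypothetical name, and the suffix list comes from importlib.machinery.all_suffixes().

```python
import importlib.machinery
import os
import sys


def candidate_files(module_name, search_path=None):
    """List file paths that could satisfy 'import module_name' (simplified)."""
    suffixes = importlib.machinery.all_suffixes()
    directories = sys.path if search_path is None else search_path
    return [
        os.path.join(directory, module_name + suffix)
        for directory in directories
        for suffix in suffixes
    ]


# Restrict the search to the current directory to keep the output short.
names = [os.path.basename(path) for path in candidate_files("spam", ["."])]
print(names)
```

On a Windows interpreter the printed list is expected to include spam.pyd alongside spam.py and spam.pyc.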
----------------------------------------------------------------------------------------------
- .py - Regular script
- .py3 - (rarely used) Python3 script. Python3 scripts usually end with ".py" not ".py3", but I have seen that a few times
- .pyc - compiled script (Bytecode)
- .pyo - optimized pyc file (As of Python3.5, Python will only use pyc rather than pyo and pyc)
- .pyw - Python script to run in Windowed mode, without a console; executed with pythonw.exe
- .pyx - Cython src to be converted to C/C++
- .pyd - Python extension module built as a Windows DLL
- .pxd - Cython declaration file, which is equivalent to a C/C++ header
- .pxi - Cython include file
- .pyi - Stub file (PEP 484)
- .pyz - Python script archive (PEP 441); this is a script containing compressed Python scripts (ZIP) in binary form after the standard Python script header
- .pywz - Python script archive for MS-Windows (PEP 441); this is a script containing compressed Python scripts (ZIP) in binary form after the standard Python script header
- .py[cod] - wildcard notation in ".gitignore" that means the file may be ".pyc", ".pyo", or ".pyd".
A larger list of additional Python file-extensions (mostly rare and unofficial) can be found at http://dcjtech.info/topic/python-file-extensions/
--------------------------------------------------------------------------------------------------------------
Python on Windows FAQ
Contents
- Python on Windows FAQ
- How do I run a Python program under Windows?
- How do I make Python scripts executable?
- Why does Python sometimes take so long to start?
- How do I make an executable from a Python script?
- Is a *.pyd file the same as a DLL?
- How can I embed Python into a Windows application?
- How do I keep editors from inserting tabs into my Python source?
- How do I check for a keypress without blocking?
How do I run a Python program under Windows?
This is not necessarily a straightforward question. If you are already familiar with running programs from the Windows command line then everything will seem obvious; otherwise, you might need a little more guidance.
Unless you use some sort of integrated development environment, you will end up typing Windows commands into what is variously referred to as a “DOS window” or “Command prompt window”. Usually you can create such a window from your search bar by searching for cmd. You should be able to recognize when you have started such a window because you will see a Windows “command prompt”, which usually looks like this:
C:\>
The letter may be different, and there might be other things after it, so you might just as easily see something like:
D:\YourName\Projects\Python>
depending on how your computer has been set up and what else you have recently done with it. Once you have started such a window, you are well on the way to running Python programs.
You need to realize that your Python scripts have to be processed by another program called the Python interpreter. The interpreter reads your script, compiles it into bytecodes, and then executes the bytecodes to run your program. So, how do you arrange for the interpreter to handle your Python?
First, you need to make sure that your command window recognises the word “py” as an instruction to start the interpreter. If you have opened a command window, you should try entering the command py and hitting return:
You should then see something like:
You have started the interpreter in “interactive mode”. That means you can enter Python statements or expressions interactively and have them executed or evaluated while you wait. This is one of Python’s strongest features. Check it by entering a few expressions of your choice and seeing the results:
Many people use the interactive mode as a convenient yet highly programmable calculator. When you want to end your interactive Python session, call the exit() function or hold the Ctrl key down while you enter a Z, then hit the “Enter” key to get back to your Windows command prompt.
You may also find that you have a Start-menu entry such as
that results in you seeing the >>> prompt in a new window. If so, the window will disappear after you call the exit() function or enter the Ctrl-Z character; Windows is running a single “python” command in the window, and closes it when you terminate the interpreter.
Now that we know the py command is recognized, you can give your Python script to it. You’ll have to give either an absolute or a relative path to the Python script. Let’s say your Python script is located in your desktop and is named hello.py, and your command prompt is nicely opened in your home directory so you’re seeing something similar to:
So now you’ll ask the py command to give your script to Python by typing py followed by your script path:
How do I make Python scripts executable?
On Windows, the standard Python installer already associates the .py extension with a file type (Python.File) and gives that file type an open command that runs the interpreter (D:\Program Files\Python\python.exe "%1" %*). This is enough to make scripts executable from the command prompt as ‘foo.py’. If you’d rather be able to execute the script by simply typing ‘foo’ with no extension you need to add .py to the PATHEXT environment variable.
Why does Python sometimes take so long to start?
Usually Python starts very quickly on Windows, but occasionally there are bug reports that Python suddenly begins to take a long time to start up. This is made even more puzzling because Python will work fine on other Windows systems which appear to be configured identically.
The problem may be caused by a misconfiguration of virus checking software on the problem machine. Some virus scanners have been known to introduce startup overhead of two orders of magnitude when the scanner is configured to monitor all reads from the filesystem. Try checking the configuration of virus scanning software on your systems to ensure that they are indeed configured identically. McAfee, when configured to scan all file system read activity, is a particular offender.
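When comparing machines, it helps to put a number on the startup time first. The following is a rough sketch that launches a fresh interpreter which does nothing and times it. The measure_startup helper is a made-up name, and the figure includes process creation overhead, so it is only useful for comparisons between systems or configurations.

```python
import subprocess
import sys
import time


def measure_startup(runs=3):
    """Return the fastest wall-clock time, in seconds, to start and exit Python."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Launch the same interpreter that is running this script.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


startup_seconds = measure_startup()
print(f"fastest startup: {startup_seconds:.3f} s")
```

Running this with the virus scanner's file monitoring enabled and disabled gives a quick indication of whether the scanner is responsible for the delay.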
Is a *.pyd file the same as a DLL?
Yes, .pyd files are dll’s, but there are a few differences. If you have a DLL named foo.pyd, then it must have a function PyInit_foo(). You can then write Python “import foo”, and Python will search for foo.pyd (as well as foo.py, foo.pyc) and if it finds it, will attempt to call PyInit_foo() to initialize it. You do not link your .exe with foo.lib, as that would cause Windows to require the DLL to be present.
Note that the search path for foo.pyd is PYTHONPATH, not the same as the path that Windows uses to search for foo.dll. Also, foo.pyd need not be present to run your program, whereas if you linked your program with a dll, the dll is required. Of course, foo.pyd is required if you want to say import foo. In a DLL, linkage is declared in the source code with __declspec(dllexport). In a .pyd, linkage is defined in a list of available functions.
How can I embed Python into a Windows application?
Embedding the Python interpreter in a Windows app can be summarized as follows:
1. Do _not_ build Python into your .exe file directly. On Windows, Python must be a DLL to handle importing modules that are themselves DLL’s. (This is the first key undocumented fact.) Instead, link to pythonNN.dll; it is typically installed in C:\Windows\System. NN is the Python version, a number such as “33” for Python 3.3.
   You can link to Python in two different ways. Load-time linking means linking against pythonNN.lib, while run-time linking means linking against pythonNN.dll. (General note: pythonNN.lib is the so-called “import lib” corresponding to pythonNN.dll. It merely defines symbols for the linker.)
   Run-time linking greatly simplifies link options; everything happens at run time. Your code must load pythonNN.dll using the Windows LoadLibraryEx() routine. The code must also use access routines and data in pythonNN.dll (that is, Python’s C API’s) using pointers obtained by the Windows GetProcAddress() routine. Macros can make using these pointers transparent to any C code that calls routines in Python’s C API.
   Borland note: convert pythonNN.lib to OMF format using Coff2Omf.exe first.
2. If you use SWIG, it is easy to create a Python “extension module” that will make the app’s data and methods available to Python. SWIG will handle just about all the grungy details for you. The result is C code that you link into your .exe file (!) You do _not_ have to create a DLL file, and this also simplifies linking.
3. SWIG will create an init function (a C function) whose name depends on the name of the extension module. For example, if the name of the module is leo, the init function will be called initleo(). If you use SWIG shadow classes, as you should, the init function will be called initleoc(). This initializes a mostly hidden helper class used by the shadow class.
   The reason you can link the C code in step 2 into your .exe file is that calling the initialization function is equivalent to importing the module into Python! (This is the second key undocumented fact.)
4. In short, you can use the following code to initialize the Python interpreter with your extension module.
5. There are two problems with Python’s C API which will become apparent if you use a compiler other than MSVC, the compiler used to build pythonNN.dll.
   Problem 1: The so-called “Very High Level” functions that take FILE * arguments will not work in a multi-compiler environment because each compiler’s notion of a struct FILE will be different. From an implementation standpoint these are very _low_ level functions.
   Problem 2: SWIG generates the following code when generating wrappers to void functions:
   Alas, Py_None is a macro that expands to a reference to a complex data structure called _Py_NoneStruct inside pythonNN.dll. Again, this code will fail in a multi-compiler environment. Replace such code by:
   It may be possible to use SWIG’s %typemap command to make the change automatically, though I have not been able to get this to work (I’m a complete SWIG newbie).
6. Using a Python shell script to put up a Python interpreter window from inside your Windows app is not a good idea; the resulting window will be independent of your app’s windowing system. Rather, you (or the wxPythonWindow class) should create a “native” interpreter window. It is easy to connect that window to the Python interpreter. You can redirect Python’s i/o to _any_ object that supports read and write, so all you need is a Python object (defined in your extension module) that contains read() and write() methods.
How do I keep editors from inserting tabs into my Python source?
The FAQ does not recommend using tabs, and the Python style guide, PEP 8, recommends 4 spaces for distributed Python code; this is also the Emacs python-mode default.
Under any editor, mixing tabs and spaces is a bad idea. MSVC is no different in this respect, and is easily configured to use spaces: Take Tools ‣ Options ‣ Tabs, and for file type “Default” set “Tab size” and “Indent size” to 4, and select the “Insert spaces” radio button.
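To see why mixing is a problem, the sketch below compiles two small snippets, one indented with spaces only and one that mixes spaces with a tab, and reports the error class Python raises. The indentation_problem helper and the two sample strings are invented for this illustration.

```python
def indentation_problem(source):
    """Return the name of the indentation error raised by source, or None."""
    try:
        compile(source, "<check>", "exec")
    except IndentationError as error:
        # TabError is a subclass of IndentationError, so it is caught here too.
        return type(error).__name__
    return None


# Second line uses eight spaces, third line uses a tab character.
MIXED = "if True:\n        x = 1\n\ty = 2\n"
# Both indented lines use four spaces.
CLEAN = "if True:\n    x = 1\n    y = 2\n"

print(indentation_problem(MIXED))
print(indentation_problem(CLEAN))
```

The mixed snippet looks aligned in an editor with eight-column tabs, which is exactly why the problem is easy to miss by eye.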
Python raises IndentationError or TabError if mixed tabs and spaces are causing problems in leading whitespace. You may also run the tabnanny module to check a directory tree in batch mode.
How do I check for a keypress without blocking?
Use the msvcrt module. This is a standard Windows-specific extension module. It defines a function kbhit() which checks whether a keyboard hit is present, and getch() which gets one character without echoing it.
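A non-blocking check built on these two functions could look like the following sketch. The poll_key helper is a hypothetical name, and the import is guarded so the snippet still loads on non-Windows systems, where msvcrt is unavailable and the helper simply returns None.

```python
try:
    import msvcrt  # only available on Windows
except ImportError:
    msvcrt = None


def poll_key():
    """Return a pending key as bytes, or None if no key is waiting."""
    if msvcrt is None:
        return None
    # kbhit() returns immediately, so this never blocks waiting for input.
    if msvcrt.kbhit():
        return msvcrt.getch()
    return None


key = poll_key()
print(key)
```

Calling poll_key() inside a loop lets a program keep doing other work and react only when a key has actually been pressed.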