introduction to python, com and pythoncom mark hammond skippi-net, melbourne, australia...
TRANSCRIPT
Introduction to Python, COM and PythonCOM
Mark Hammond
Skippi-Net, Melbourne, Australia
[email protected]://starship.python.net/crew/mhammond
Introduction to Python, COM and PythonCOM
The plan• Section I - Intro to Python
– 30-45 mins
• Section II - Intro to COM– 30-45 mins
• Python and COM– The rest!
Section I
Introduction to Python
What Is Python?
• Created in 1990 by Guido van Rossum– While at CWI, Amsterdam– Now hosted by centre for national research
initiatives, Reston, VA, USA
• Free, open source– And with an amazing community
• Object oriented language– “Everything is an object”
Why Python?
• Designed to be easy to learn and master– Clean, clear syntax– Very few keywords
• Highly portable– Runs almost anywhere - high end servers and
workstations, down to windows CE– Compiled to machine independent byte-codes
• Extensible– Designed to be extensible using C/C++, thereby allowing
access to many external libraries
Most obvious and notorious features
• Clean syntax plus high-level data types– Leads to fast coding
• Uses white-space to delimit blocks– Humans generally do, so why not the language?– Try it, you will end up liking it
• Variables do not need declaration– Although not a type-less language
Pythonwin
• We are using Pythonwin– Only available on Windows– GUI toolkit using Tkinter available for most
platforms– Standard console Python available on all platforms
• Has interactive mode for quick testing of code
• Includes debugger and Python editor
Interactive Python
• Starting Python.exe, or any of the GUI environments present an interactive mode
–>>> prompt indicates start of a statement or expression
– If incomplete, ... prompt indicates second and subsequent lines
– All expression results printed back to interactive console
Variables and Types (1 of 3)
• Variables need no declaration• >>> a=1>>>
• As a variable assignment is a statement, there is no printed result
• >>> a1
• Variable name alone is an expression, so the result is printed
Variables and Types (2 of 3)
• Variables must be created before they can be used
• >>> bTraceback (innermost last): File "<interactive input>", line 1, in ?NameError: b>>>
• Python uses exceptions - more detail later
Variables and Types (3 of 3)
• Objects always have a type• >>> a = 1>>> type(a)<type 'int'> >>> a = "Hello">>> type(a)<type 'string'>>>> type(1.0)<type 'float'>
Assignment versus Equality Testing
• Assignment performed with single =• Equality testing done with double = (==)
– Sensible type promotions are defined– Identity tested with is operator.
• >>> 1==11>>> 1.0==11>>> "1"==10
Simple Data Types
• Strings– May hold any data, including embedded NULLs– Declared using either single, double, or triple
quotes– >>> s = "Hi there">>> s'Hi there'>>> s = "Embedded 'quote'">>> s"Embedded 'quote'"
Simple Data Types
– Triple quotes useful for multi-line strings– >>> s = """ a long... string with "quotes" or anything else""">>> s' a long\012string with "quotes" or anything else' >>> len(s)45
Simple Data Types• Integer objects implemented using C longs
– Like C, integer division returns the floor– >>> 5/22
• Float types implemented using C doubles
• Long Integers have unlimited size– Limited only by available memory– >>> long = 1L << 64>>> long ** 52135987035920910082395021706169552114602704522356652769947041607822219725780640550022962086936576L
High Level Data Types
• Lists hold a sequence of items– May hold any object– Declared using square brackets
• >>> l = []# An empty list>>> l.append(1)>>> l.append("Hi there")>>> len(l)2
High Level Data Types
• >>> l[1, 'Hi there']>>>>>> l = ["Hi there", 1, 2]>>> l['Hi there', 1, 2]>>> l.sort()>>> l[1, 2, 'Hi there']
High Level Data Types
• Tuples are similar to lists– Sequence of items
– Key difference is they are immutable
– Often used in place of simple structures
• Automatic unpacking• >>> point = 2,3>>> x, y = point>>> x2
High Level Data Types
• Tuples are particularly useful to return multiple values from a function
• >>> x, y = GetPoint()• As Python has no concept of byref
parameters, this technique is used widely
High Level Data Types
• Dictionaries hold key-value pairs– Often called maps or hashes. Implemented using
hash-tables– Keys may be any immutable object, values may
be any object– Declared using braces
• >>> d={}>>> d[0] = "Hi there">>> d["foo"] = 1
High Level Data Types
• Dictionaries (cont.)• >>> len(d)2>>> d[0]'Hi there'>>> d = {0 : "Hi there", 1 : "Hello"}>>> len(d)2
Blocks
• Blocks are delimited by indentation– Colon used to start a block– Tabs or spaces may be used– Maxing tabs and spaces works, but is discouraged
• >>> if 1:... print "True"... True>>>
Blocks
• Many people hate this when they first see it– Almost all Python programmers come to love it
• Humans use indentation when reading code to determine block structure– Ever been bitten by the C code?:
• if (1) printf("True"); CallSomething();
Looping
• The for statement loops over sequences• >>> for ch in "Hello":... print ch... Hello>>>
Looping
• Built-in function range() used to build sequences of integers
• >>> for i in range(3):... print i... 012>>>
Looping
• while statement for more traditional loops• >>> i = 0>>> while i < 3:... print i... i = i + 1... 012>>>
Functions
• Functions are defined with the def statement:
• >>> def foo(bar):... return bar>>>
• This defines a trivial function named foo that takes a single parameter bar
Functions
• A function definition simply places a function object in the namespace
• >>> foo<function foo at fac680>>>>
• And the function object can obviously be called:• >>> foo(3)3>>>
Classes
• Classes are defined using the class statement
• >>> class Foo:... def __init__(self):... self.member = 1... def GetMember(self):... return self.member... >>>
Classes
• A few things are worth pointing out in the previous example:– The constructor has a special name __init__,
while a destructor (not shown) uses __del__– The self parameter is the instance (ie, the this
in C++). In Python, the self parameter is explicit (c.f. C++, where it is implicit)
– The name self is not required - simply a convention
Classes
• Like functions, a class statement simply adds a class object to the namespace
• >>> Foo<class __main__.Foo at 1000960>>>>
• Classes are instantiated using call syntax• >>> f=Foo()>>> f.GetMember()1
Modules
• Most of Python’s power comes from modules
• Modules can be implemented either in Python, or in C/C++
• import statement makes a module available• >>> import string>>> string.join( ["Hi", "there"] )'Hi there'>>>
Exceptions
• Python uses exceptions for errors– try / except block can handle exceptions
• >>> try:... 1/0... except ZeroDivisionError:... print "Eeek"... Eeek>>>
Exceptions
• try / finally block can guarantee execute of code even in the face of exceptions
• >>> try:... 1/0... finally:... print "Doing this anyway"... Doing this anywayTraceback (innermost last): File "<interactive input>", line 2, in ?ZeroDivisionError: integer division or modulo
>>>
Threads
• Number of ways to implement threads• Highest level interface modelled after Java• >>> class DemoThread(threading.Thread):... def run(self):... for i in range(3):... time.sleep(3)... print i... >>> t = DemoThread()>>> t.start()>>> t.join()01 <etc>
Standard Library
• Python comes standard with a set of modules, known as the “standard library”
• Incredibly rich and diverse functionality available from the standard library– All common internet protocols, sockets, CGI, OS
services, GUI services (via Tcl/Tk), database, Berkeley style databases, calendar, Python parser, file globbing/searching, debugger, profiler, threading and synchronisation, persistency, etc
External library
• Many modules are available externally covering almost every piece of functionality you could ever desire– Imaging, numerical analysis, OS specific
functionality, SQL databases, Fortran interfaces, XML, Corba, COM, Win32 API, etc
• Way too many to give the list any justice
Python Programs
• Python programs and modules are written as text files with traditionally a .py extension
• Each Python module has its own discrete namespace
• Python modules and programs are differentiated only by the way they are called– .py files executed directly are programs (often referred
to as scripts)– .py files referenced via the import statement are
modules
Python Programs
• Thus, the same .py file can be a program/script, or a module
• This feature is often used to provide regression tests for modules– When module is executed as a program, the
regression test is executed– When module is imported, test functionality is not
executed
Python “Protocols”
• Objects can support a number of different “protocols”
• This allows your objects to be treated as:– a sequence - ie, indexed or iterated over– A mapping - obtain/assign keys or values– A number - perform arithmetic– A container - perform dynamic attribute fetching
and setting– Callable - allow your object to be “called”
Python “Protocols”
• Sequence and container example• >>> class Protocols:... def __getitem__(self, index):... return index ** 2... def __getattr__(self, attr):... return "A big " + attr... >>> p=Protocols()>>> p[3]9>>> p.Foo'A big Foo’>>>
More Information on Python
• Can’t do Python justice in this short time frame– But hopefully have given you a taste of the
language
• Comes with extensive documentation, including an excellent tutorial and library reference– Also a number of Python books available
• Visit www.python.org for more details
Section II
Introduction to COM
What is COM?
• Acronym for Component Object Model, a technology defined and implemented by Microsoft
• Allows “objects” to be shared among many applications, without applications knowing the implementation details of the objects
• A broad and complex technology
• We can only provide a brief overview here
What was COM
• COM can trace its lineage back to DDE
• DDE was expanded to Object Linking and Embedding (OLE)
• VBX (Visual Basic Extensions) enhanced OLE technology for visual components
• COM was finally derived as a general purpose mechanism– Initially known as OLE2
COM Interfaces
• COM relies heavily on interfaces
• An interface defines functionality, but not implementation– Each object (or more correctly, each class) defines
implementation of the interface
– Each implementation must conform to the interface
• COM defines many interfaces– But often does not provide implementation of these
interfaces
COM Interfaces
• Interfaces do not support properties – We will see how COM properties are typically
defined later
• Interfaces are defined using a vtable scheme similar to how C++ defines virtual methods
• All interfaces have a unique ID (an IID)– Uses a Universally Unique Identifer (UUID)– UUIDs used for many COM IDs, including IIDs
Classes
• COM defines the concept of a class, used to create objects– Conceptually identical to a C++ or Python class– To create an object, COM locates the class factory, and
asks it to create an instance
• Classes have two identifiers– Class ID (CLSID) is a UUID, so looks similar to an
IID– ProgID is a friendly string, and therefore not
guaranteed unique
IUnknown interface
• Base of all COM interfaces– By definition, all interfaces also support the
IUnknown interface
• Contains only three methods– AddRef() and Release() for managing COM
lifetimes• COM lifetimes are based on reference counts
– QueryInterface() for obtaining a new interface from the object
Creating objects, and obtaining interfaces
• To create an object, the programmer specifies either the ProgID, or the CLSID
• This process always returns the requested interface– or fails!
• New interfaces are obtained by using the IUnknown::QueryInterface() method– As each interface derives from IUnknown, each
interface must support QI
Other standard interfaces
• COM defines many interfaces, for example:• IStream
– Defines file like operations
• IStorage– Defines file system like semantics
• IPropertyPage– Defines how a control exposes a property page
• etc - many many interfaces are defined– But not many have implementations
Custom interfaces
• COM allows you to define your own interfaces
• Interfaces are defined using an Interface Definition Language (IDL)
• Tools available to assign unique IIDs for the interface
• Any object can then implement or consume these interfaces
IDispatch - Automation objects
• IDispatch interface is used to expose dynamic object models
• Designed explicitly for scripting languages, or for those languages that can not use normal COM interfaces– eg, where the interface is not known at compile time, or
there is no compile time at all
• IDispatch is used extensively– Microsoft Office, Netscape, Outlook, VB, etc - almost
anything designed to be scripted
IDispatch - Automation objects
• Methods and properties of the object model can be determined at runtime– Concept of Type Libraries, where the object
model can be exposed at compile time
• Methods and properties are used indirectly– GetIDsOfNames() method is used to get an ID for
a method or property– Invoke() is used to make the call
IDispatch - Automation objects
• Languages usually hide these implementation details from the programmer
• Example: object.SomeCall()– Behind the scenes, your language will:id = GetIDsOfNames("SomeCall")Invoke(id, DISPATCH_METHOD)
• Example: object.SomeProp– id = GetIDsOfNames("SomeProp")Invoke(id, DISPATCH_PROP_GET)
VARIANTs
• The IDispatch interface uses VARIANTs as its primary data type
• Simply a C union that supports the common data types– and many helper functions for conversion etc
• Allows the single Invoke() call to accept almost any data type
• Many languages hide these details, performing automatic conversion as necessary
Implementation models
• Objects can be implemented in a number of ways– InProc objects are implemented as DLLs, and loaded
into the calling process• Best performance, as no marshalling is required
– LocalServer/RemoteServer objects are implemented as stand-alone executables
• Safer due to process isolation, but slower due to remoting
• Can be both, and caller can decide, or let COM choose the best
Distributed COM
• DCOM allows objects to be remote from their caller
• DCOM handles all marshalling across machines and necessary security
• Configuration tools allow an administrator to configure objects so that neither the object nor the caller need any changes– Although code changes can be used to explicitly
control the source of objects
The Windows Registry
• Information on objects stored in the Windows registry– ProgID to CLSID mapping– Name of DLL for InProc objects– Name of EXE for LocalServer objects– Other misc details such as threading models
• Lots of other information also maintained in registry– Remoting proxies, object security, etc.
Conclusion for Section II
• COM is a complex and broad beast
• Underlying it all is a fairly simple Interface model– Although all the bits around the edges combine to
make it very complex
• Hopefully we have given enough framework to put the final section into some context
Section III
Python and COM
PythonCOM architecture
• Underpinning everything is an extension module written in C++ that provides the core COM integration– Support for native COM interfaces exist in this
module– Python reference counting is married with COM
reference counting
• Number of Python implemented modules that provide helpers for this core module
Interfaces supported by PythonCOM
• Over 40 standard interfaces supported by the framework
• Extension architecture where additional modules can add support for their own interfaces– Tools supplied to automate this process– Number of extension modules supplied, bringing
total interfaces to over 100
Using Automation from Python
• Automation uses IDispatch to determine object model at runtime
• Python function win32com.client.Dispatch() provides this run-time facility
• Allows a native “look and feel” to these objects
Using Automation - Example
• We will use Excel for this demonstration
• Excel ProgId is Excel.Application• >>> from win32com.client import Dispatch>>> xl=Dispatch("Excel.Application")>>> xl<COMObject Excel.Application>
>>>
Using Automation - Example
• Now that we have an Excel object, we can call methods and set properties
• But we can’t see Excel running?– It has a visible property that may explain things
• >>> xl.Visible0
• Excel is not visible, let’s make it visible>>> xl.Visible = 1>>>
Automation - Late vs. Early Bound
• In the example we just saw, we have been using late bound COM– Python has no idea what properties or methods
are available– As we attempt a method or property access,
Python dynamically asks the object– Slight performance penalty, as we must resolve
the name to an ID (GetIDsOfNames()) before calling Invoke()
Automation - Late vs. Early Bound
• If an object provides type information via a Type Library, Python can use early bound COM
• Implemented by generating a Python source file with all method and property definitions– Slight performance increase as all names have
been resolved to IDs at generation time, rather than run-time
Automation - Late vs. Early Bound
• Key differences between the 2 techniques:– Late bound COM often does not know the specific
types of the parameters• Type of the Python object determines the VARIANT type
created• Does not know about ByRef parameters, so no parameters
are presented as ByRef
– Early bound COM knows the types of the parameters• All Python types are coerced to the correct type• ByRef parameters work (returned as tuples)
Playing with Excel
• >>> xl.Workbooks.Add()<COMObject <unknown>>>>> xl.Range("A1:C1").Value = "Hi", "From", "Python">>> xl.Range("A1:C1").Value((L'Hi', L'From', L'Python'),)>>> xl.Range("A1:C1").PrintOut()>>>
How did we know the methods?
• Indeed, how did we know to use “Excel.Application”?
• No easy answer - each application/object defines their own object model and ProgID
• Documentation is the best answer– Note that MSOffice does not install the COM
documentation by default - must explicitly select it during setup
• COM browsers can also help
Native Interfaces from Python
• Examples so far have been using using IDispatch– win32com.client.Dispatch() function
hides the gory details from us
• Now an example of using interfaces natively– Need a simple example - Windows Shortcuts fits
the bill
• Develop some code that shows information about a Windows shortcut
Native Interfaces from Python
• 4 main steps in this process– Import the necessary Python modules– Obtain an object that implements the IShellLink
interface– Obtain an IPersistFile interface from the object,
and load the shortcut– Use the IShellLink interface to get information
about the shortcut
Native Interfaces from PythonStep 1: Import the necessary Python modules
– We need the pythoncom module for core COM support
– We need the win32com.shell.shell module
• This is a PythonCOM extension that exposes additional interfaces
• >>> import pythoncom>>> from win32com.shell import shell
Native Interfaces from PythonStep 2: Obtain an object that implements the
IShellLink interface
• Use the COM function CoCreateInstance()
• Use the published CLSID for the shell
• The shell requires that the object request be for an InProc object.
• Request the IShellLink interface
Native Interfaces from PythonStep 2: Obtain an object that implements the
IShellLink interface• >>> sh =
pythoncom.CoCreateInstance(shell.CLSID_ShellLink,
... None,
... pythoncom.CLSCTX_INPROC_SERVER,
... shell.IID_IShellLink)>>> sh<PyIShellLink at 0x1630b04 with obj at 0x14c9d8>>>>
Native Interfaces from PythonStep 3: Obtain an IPersist interface from the
object, and load the shortcut• Use QueryInterface to obtain the new interface
from the object• >>> pe=
sh.QueryInterface(pythoncom.IID_IPersistFile)>>> pe.Load("c:/winnt/profiles/skip/desktop/IDLE.lnk")
Native Interfaces from PythonStep 4: Use the IShellLink interface to get
information about the shortcut
• >>> sh.GetWorkingDirectory()'L:\\src\\python-cvs\\tools\\idle'>>> sh.GetArguments()'l:\\src\\python-cvs\\tools\\idle\\idle.pyw'
>>>
Implementing COM using Python.
• Final part of our tutorial is implementing COM objects using Python
• 3 main steps we must perform– Implement a Python class that exposes the
functionality– Annotate the class with special attributes required
by the COM framework.– Register our COM object
Implementing COM using Python.Step 1: Implement a Python class that exposes
the functionality
• We will build on our last example, by providing a COM object that gets information about a Windows shortcut
• Useful to use from VB, as it does not have direct access to these interfaces.
• We will design a class that can be initialised to a shortcut, and provides methods for obtaining the info.
Implementing COM using PythonStep 1: Implement a Python class that exposes
the functionality
• Code is getting to big for the slides
• Code is basically identical to that presented before, except:– The code has moved into a class.– We store the IShellInfo interface as an instance
attribute
Implementing COM using PythonStep 1: Implement a Python class that exposes
the functionality
• Provide an Init method that takes the name of the shortcut.
• Also provide two trivial methods that simply delegate to the IShellInfo interface
• Note Python would allow us to automate the delegation, but that is beyond the scope of this!
Implementing COM using PythonStep 2: Annotate the class with special
attributes• PythonCOM requires a number of special
attributes– The CLSID of the object (a UUID)
• Generate using print pythoncom.CreateGuid()
– The ProgID of the object (a friendly string)• Make one up!
– The list of public methods• All methods not listed as public are not exposed via
COM
Implementing COM using PythonStep 2: Annotate the class with special
attributes
• class PyShellLink: _public_methods_=['Init', 'GetWorkingDirectory', 'GetArguments'] _reg_clsid_="{58DE4632-9323-11D3-85F9-00C04FEFD0A7}" _reg_progid_="Python.ShellDemo" ...
Implementing COM using PythonStep 3: Register our COM object
• General technique is that the object is registered when the module implementing the COM object is run as a script.– Recall previous discussion on modules versus
scripts
• Code is trivial– Call the UseCommandLine() method, passing
the class objects we wish to register
Implementing COM using PythonStep 3: Register our COM object
• Code is simple– note we are passing the class object, not a class
instance.• if __name__=='__main__': UseCommandLine(PyShellLink)
• Running this script yields:• Registered: Python.ShellDemo
Testing our COM object
• Our COM object can be used by any automation capable language– VB, Delphi, Perl - even VC at a pinch!
• We will test with a simple VBScript program– VBScript is free. Could use full blown VB, or
VBA. Syntax is identical in all cases
Testing our COM object
• set shdemo = CreateObject("Python.ShellDemo")
shdemo.Init("c:\winnt\profiles\skip\desktop\IDLE.lnk")
WScript.Echo "Working dir is " + shdemo.GetWorkingDirectory()
• Yields• Working dir is L:\src\python-cvs\tools\idle
Summary
• Python can make COM simple to work with– No reference count management– Many interfaces supported.– Natural use for IDispatch (automation) objects.
• Simple to use COM objects, or implement COM objects
Questions?
More Information
• Python itself– http://www.python.org– news:comp.lang.python
• Python on Windows– http://www.python.org/windows
• Mark Hammond’s Python extensions– http://starship.python.net/crew/mhammond– http://www.python.org/windows