Jan 29 14:04:40 the speaker is David Malcolm, packager of python for Fedora
Jan 29 14:05:11 he is planning to talk about different species of python (jython, cython etc.)
Jan 29 14:05:35 seems to be a technical audience (experience with python, Java, fedora packaging etc)
Jan 29 14:05:46 it probably would be
Jan 29 14:05:52 pypy etc
Jan 29 14:06:09 * attempting to get his laptop to behave and show the slides properly
Jan 29 14:06:25 So why do we care about the different species of python?
Jan 29 14:06:59 * interruption for request to transcribe the presentation
Jan 29 14:07:31 * interruption completed
Jan 29 14:07:49 slides will be uploaded to David Malcolm's fedora people page once he has internet access
Jan 29 14:07:57 so why do we care about different species of python?
Jan 29 14:08:17 intellectually interested in different implementations, different strengths/weaknesses
Jan 29 14:08:25 memory usage, debugging ability, etc.
Jan 29 14:08:53 also interacting with other technologies (e.g. jython for interacting with java)
Jan 29 14:09:18 Doesn't assert that there is a single best implementation of python - they all have their strengths and best places
Jan 29 14:09:23 so what is python for?
Jan 29 14:09:30 - one off scripts
Jan 29 14:09:37 tflink: thanks
Jan 29 14:09:44 - simple hacks that can be changed into something long-term
Jan 29 14:10:05 - highly readable high-level language
Jan 29 14:10:18 - Python is "Batteries Included"
Jan 29 14:10:33 * feel free to ask questions (even remote) - I will try to relay
Jan 29 14:10:54 Python can also be used as glue code for bridging libraries with high level code
Jan 29 14:11:08 sometimes, the linux community is too independent - won't accept a common runtime
Jan 29 14:11:42 python can be used as something of a "common runtime" (as much as anything)
Jan 29 14:12:07 Since python can be easily plugged into c++, easy to use with gdb
Jan 29 14:12:14 easy to bind to C libs is a strength
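As a rough illustration of how little glue a C binding can need, here is a minimal sketch using the standard library's ctypes module. Note that ctypes is a different route from the hand-written C extension API the talk goes on to describe, and the helper names `load_libc` and `c_strlen` are made up for this example. It assumes a POSIX system where the C library can be located.

```python
import ctypes
import ctypes.util


def load_libc():
    """Load the platform C library (assumes a POSIX system)."""
    name = ctypes.util.find_library("c")
    # If no name is found, CDLL(None) falls back to the symbols already
    # loaded into the current process, which on POSIX include libc.
    return ctypes.CDLL(name)


libc = load_libc()

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Call the C strlen() on a bytes object."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"fedora"))
```

Declaring `argtypes` and `restype` up front lets ctypes check the call instead of guessing the C types.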
Jan 29 14:12:23 So where is python used in Fedora?
Jan 29 14:12:46 powers *.fedoraproject.org
Jan 29 14:13:08 also used by TurboGears, Django, other apps (koji et. al.)
Jan 29 14:13:23 Fedora infrastructure does use some Django, but it is minimal
Jan 29 14:13:46 So we have all these possible uses of python (glue code, web development, simple scripts ...)
Jan 29 14:14:05 -> "Python" vs "CPython"
Jan 29 14:14:10 Python -> language
Jan 29 14:14:31 CPython -> what most people think of as python (generally "/usr/bin/python") and the original implementation
Jan 29 14:14:42 * DiscordianUK nods
Jan 29 14:15:37 * missed the bullets on slide about kloc in sections of CPython
Jan 29 14:15:55 CPython's object system
Jan 29 14:15:58 many klocs I'm sure
Jan 29 14:16:24 CPython is an implementation in C and has objects and types hand-coded in C
Jan 29 14:16:34 Objects are C structs with a ref count
Jan 29 14:16:57 references between objects are just C pointers -> objects can't move around in memory
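The reference count can be observed from Python itself. The sketch below assumes CPython, since `sys.getrefcount` and the meaning of `id()` as a memory address are CPython details rather than language guarantees. The function name `refcount_after_new_reference` is made up for this example.

```python
import sys


def refcount_after_new_reference():
    """Return the refcount of an object before and after adding one reference."""
    obj = object()
    before = sys.getrefcount(obj)
    holder = [obj]  # the list now holds one more reference to obj
    after = sys.getrefcount(obj)
    # The object did not move: its identity is unchanged inside the list.
    same_identity = id(holder[0]) == id(obj)
    return before, after, same_identity


if __name__ == "__main__":
    print(refcount_after_new_reference())
```

The absolute numbers vary between CPython versions, so only the difference between the two counts is meaningful.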
Jan 29 14:17:51 there is one big mutex in python (for counting references, if I heard correctly)
Jan 29 14:18:05 * question about a patch by google to remove that mutex
Jan 29 14:18:13 The Global Interpreter Lock (GIL)
Jan 29 14:18:26 there was an attempt to remove the mutex in the past (0.99 era?) but it failed
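A small experiment makes the effect of the GIL visible. This is only a sketch: the function names and the loop size are arbitrary choices for this example, and the timings depend on the machine. On a CPython build that has a GIL, the threaded run of this CPU-bound loop is usually no faster than the serial one.

```python
import threading
import time


def countdown(n):
    """Pure-Python CPU-bound loop; it holds the GIL while it runs."""
    while n > 0:
        n -= 1
    return n


def run_serial(n, jobs):
    """Run the loop `jobs` times back to back and return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(jobs):
        countdown(n)
    return time.perf_counter() - start


def run_threaded(n, jobs):
    """Run the loop in `jobs` threads at once and return elapsed seconds."""
    threads = [threading.Thread(target=countdown, args=(n,)) for _ in range(jobs)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"serial:   {run_serial(2_000_000, 2):.2f}s")
    print(f"threaded: {run_threaded(2_000_000, 2):.2f}s")
```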
Jan 29 14:18:58 * transcribers note - sorry, I'm having a bit of trouble keeping up. missing a little bit
Jan 29 14:19:23 thanks for what you are doing
Jan 29 14:19:25 the other issue with CPython is reference counting
Jan 29 14:19:54 these pointers are being passed around by hand, and it's easy to get wrong
Jan 29 14:20:08 can end up with memory leaks, segfaults and other hard to debug situations
Jan 29 14:20:12 but on the other hand, it is simple
Jan 29 14:20:32 the next part of CPython is the interpreter
Jan 29 14:20:51 python compiles the code down to bytecode which is a series of simple operations
Jan 29 14:21:35 the .py files are turned into a syntax tree, which is turned into instructions for the "fake" CPU, and some operations are collapsed into just data
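The source -> syntax tree -> bytecode pipeline can be walked by hand with the standard `ast` and `dis` modules. A minimal sketch follows; the `add` function is just a stand-in. The exact opcode names differ between CPython versions, so none are assumed here.

```python
import ast
import dis

SOURCE = "def add(a, b):\n    return a + b\n"

# Step 1: source text -> syntax tree
tree = ast.parse(SOURCE)

# Step 2: syntax tree -> code object (bytecode for the "fake" CPU)
code = compile(tree, "<example>", "exec")

# Step 3: run the module-level code so that `add` gets defined
namespace = {}
exec(code, namespace)
add = namespace["add"]

# Step 4: list the instructions the interpreter loop will execute
opnames = [instruction.opname for instruction in dis.get_instructions(add)]

if __name__ == "__main__":
    print(ast.dump(tree.body[0].body[0]))
    print(opnames)
```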
Jan 29 14:21:54 example: if using the len() function, it is possible to redefine that
Jan 29 14:22:19 also possible to redefine stuff like True and False, so it's hard to do traditional optimizations at compile time
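To see why the compiler cannot simply inline `len()`, here is a sketch that swaps the builtin at runtime. The names `size_of` and `call_with_fake_len` are hypothetical. In Python 3, `True` and `False` became keywords and can no longer be rebound, but builtins such as `len` still can.

```python
import builtins


def size_of(items):
    # `len` is looked up by name on every call, not bound at compile time.
    return len(items)


def call_with_fake_len(items):
    """Temporarily replace the builtin len() and call size_of()."""
    real_len = builtins.len
    builtins.len = lambda obj: 42
    try:
        return size_of(items)
    finally:
        builtins.len = real_len  # always restore the real builtin


if __name__ == "__main__":
    print(size_of([1, 2, 3]))
    print(call_with_fake_len([1, 2, 3]))
```

The same function returns different answers without its own code changing, which is what rules out most ahead-of-time optimizations.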
Jan 29 14:22:36 * question - is byte code consistent between implementations?
Jan 29 14:22:46 no, the bytecode is not consistent between the implementations
Jan 29 14:23:20 that's a fail then
Jan 29 14:23:25 there is a marker in the .pyc that identifies the version of the bytecode generated and a timestamp of the associated .py file
Jan 29 14:23:54 so when you compile a .py file to .pyc, it's kind of like a makefile
Jan 29 14:24:12 when you run a .py, the bytecode has to exactly match the runtime, or else it will be recompiled
Jan 29 14:24:49 but the bytecode generally stays consistent between updates of the same version (ex. python2.7 versions all have the same bytecode "magic number")
Jan 29 14:25:02 but the bytecode number could change between development versions
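The marker and timestamp can be read straight out of a .pyc file. The sketch below assumes the 16-byte header layout used by CPython 3.7 and later (magic, flags, source mtime, source size); older versions, including the 2.x series discussed in the talk, use a shorter header. The function name `inspect_pyc` and the file names are made up for this example.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile


def inspect_pyc():
    """Compile a throwaway module and decode the header of its .pyc file."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "hello.py")
        pyc = os.path.join(tmp, "hello.pyc")
        with open(src, "w") as handle:
            handle.write("print('hello')\n")
        # Ask explicitly for the timestamp-based .pyc flavour.
        py_compile.compile(
            src,
            cfile=pyc,
            doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        with open(pyc, "rb") as handle:
            header = handle.read(16)
        magic = header[:4]
        flags, mtime, size = struct.unpack("<III", header[4:16])
        source_mtime = int(os.stat(src).st_mtime) & 0xFFFFFFFF
        return magic, flags, mtime, source_mtime, size


if __name__ == "__main__":
    magic, flags, mtime, source_mtime, size = inspect_pyc()
    print("magic matches this runtime:", magic == importlib.util.MAGIC_NUMBER)
    print("stored mtime matches source:", mtime == source_mtime)
```

If either comparison fails at import time, the interpreter recompiles the .py file, which is the makefile-like behaviour described above.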
Jan 29 14:25:30 * missed the question
Jan 29 14:25:44 ahhh so the bytecode is consistent across OSes?
Jan 29 14:26:03 the problem is that the .pyc files were generally living next to the .py files, but there is a proposal to change that
Jan 29 14:26:17 the new proposal would have a separate directory for bytecode
Jan 29 14:26:26 in a __pycache__ directory
Jan 29 14:26:38 that would have a dir for each bytecode version
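In the form that shipped with CPython 3.2, the directory is named `__pycache__` and each bytecode file carries a tag naming the runtime that wrote it. A minimal sketch using only standard library calls follows; `spam.py` is a placeholder file name.

```python
import importlib.util
import platform
import sys

# Where would this runtime cache the bytecode for spam.py?
cache_path = importlib.util.cache_from_source("spam.py")

# The tag identifies the implementation and version that wrote the file.
cache_tag = sys.implementation.cache_tag

if __name__ == "__main__":
    print(platform.python_implementation(), cache_tag, cache_path)
```

Because the tag is part of the file name, several runtimes can share one source tree without overwriting each other's bytecode.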
Jan 29 14:27:13 DiscordianUK: the bytecode should be consistent across OSes
Jan 29 14:27:20 the important variable is the runtime
Jan 29 14:28:06 so the opcodes (and the byte code) will change between pypy, CPython, Jython, IronPython etc.
Jan 29 14:28:16 but should stay the same for all versions of CPython 2.6 etc.
Jan 29 14:28:36 * question - are the python versions backward compatible (can you run 2.3 bytecode on a 2.6 interpreter)
Jan 29 14:28:53 no, you can't do that because they may have removed opcodes or added opcodes
Jan 29 14:29:07 even the functions could have changed between versions
Jan 29 14:29:32 i.e. there are semantic differences between the different versions of python
Jan 29 14:30:21 * example on screen - not sure that I can type fast enough
Jan 29 14:30:52 there will hopefully be slides
Jan 29 14:31:07 talking about decrementing the reference count inside a while loop and some of the potential problems in CPython implementation
Jan 29 14:31:18 DiscordianUK: he said he would post them when he gets internet access
Jan 29 14:31:34 The good parts of CPython:
Jan 29 14:31:38 main loop is a giant switch() statement to process the .pyc opcodes
Jan 29 14:31:49 easy to bind to C code
Jan 29 14:31:56 (just please do it correctly)
Jan 29 14:32:22 you can wrap other C code with "python like" data types to be able to include it into python code
Jan 29 14:32:43 it is a rather simple implementation, in the grand scheme of things
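The "giant switch" main loop mentioned above can be pictured with a toy stack machine. This is only an illustration: the opcode names below are invented for the sketch and are not CPython's real opcodes, and the real loop is written in C.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        opcode, arg = program[pc]
        pc += 1
        # The if/elif chain plays the role of the C switch() statement.
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == "RETURN":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    raise RuntimeError("program ended without RETURN")


# (2 + 3) * 4 expressed as instructions for the toy machine
PROGRAM = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD", None),
    ("PUSH", 4),
    ("MUL", None),
    ("RETURN", None),
]

if __name__ == "__main__":
    print(run(PROGRAM))
```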
Jan 29 14:32:49 the bad parts of CPython:
Jan 29 14:33:14 it is a bit slow since you're always interpreting the bytecode - never going to be as fast as machine code
Jan 29 14:33:32 since the language is so dynamic, you can't use a lot of the traditional optimizations for compile time
Jan 29 14:34:10 The Global Interpreter Lock is another disadvantage
Jan 29 14:34:25 * question - what about google's unladen swallow?
Jan 29 14:34:33 that was a project to add a JIT to CPython
Jan 29 14:35:09 they tried to take LLVM (low level virtual machine) which is a library that implements a lot of the things that a compiler could use
Jan 29 14:35:36 so you could construct fragments of code and say "give me machine code"
Jan 29 14:36:00 for example, when a piece of python code is being called 1000 times with the same int values
Jan 29 14:36:26 so the JIT would make machine code instead of a big switch statement
Jan 29 14:36:34 the hope was that it would provide a HUGE speedup
Jan 29 14:36:42 unfortunately, it was only about 20% speedup
Jan 29 14:37:10 since you are generating all that code at runtime, you are doing a LOT of checks (~5 conditionals before adding 2 ints)
Jan 29 14:37:24 in theory, you could optimize all of that away with clever coding
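To give a feel for the checks hiding behind a simple `a + b`, here is a simplified sketch of the dispatch written in Python. It is not CPython's actual code: the function name `dynamic_add` is made up, and real dispatch has further rules (for example, subclass priority) that are left out here.

```python
def dynamic_add(a, b):
    """Simplified model of the checks behind `a + b`."""
    type_a, type_b = type(a), type(b)

    # Check 1: the fast path a JIT hopes to specialise on.
    if type_a is int and type_b is int:
        return int.__add__(a, b)

    # Checks 2 and 3: does the left operand know how to add?
    add = getattr(type_a, "__add__", None)
    if add is not None:
        result = add(a, b)
        if result is not NotImplemented:
            return result

    # Checks 4 and 5: otherwise give the right operand a chance.
    radd = getattr(type_b, "__radd__", None)
    if radd is not None and type_b is not type_a:
        result = radd(b, a)
        if result is not NotImplemented:
            return result

    raise TypeError(
        f"unsupported operand types: {type_a.__name__} and {type_b.__name__}"
    )


if __name__ == "__main__":
    print(dynamic_add(2, 3))
    print(dynamic_add(1, 2.5))
    print(dynamic_add("a", "b"))
```

A JIT only wins when it can prove most of these branches away for a given call site, which is the "clever coding" referred to above.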
Jan 29 14:37:45 but the last word on the mailing list was that all the people at google who were working on this have moved on to other projects
Jan 29 14:38:01 at this time, there doesn't seem to be any people primed to take over the unladen swallow project
Jan 29 14:38:03 moving on ...
Jan 29 14:38:23 reference counting fun -> it is too easy to get it wrong and cause a crash or other problems
Jan 29 14:38:36 Objects can't move around in memory and this can fragment the heap
Jan 29 14:39:06 lots of reference count twiddling -> impossible to have read-only data in shared memory pages (e.g. KVM's KSM)
Jan 29 14:39:22 There is a non-opaque object API
Jan 29 14:39:30 the implementation details are visible to C extensions
Jan 29 14:39:45 this makes them hard to change without breaking hundreds of extensions
Jan 29 14:39:57 example: strings are merely string + length
Jan 29 14:40:13 * notes that there isn't much time left
Jan 29 14:40:23 if you're going to do extensions, please use Cython
Jan 29 14:40:36 it auto-generates .c code, handling a lot of the details
Jan 29 14:40:50 PLEASE don't use SWIG (this will probably be controversial)
Jan 29 14:41:05 all you get are python objects that wrap C and C++ pointers
Jan 29 14:41:12 * going faster now
Jan 29 14:41:18 Debug builds
Jan 29 14:41:30 Compile CPython --with-pydebug
Jan 29 14:41:49 adds lots of useful debugging instrumentation and makes it easier to debug in gdb
Jan 29 14:41:52 but it is a LOT slower
Jan 29 14:42:33 keep in mind that while the .h files are the same between debug and non-debug builds, the .so files are NOT COMPATIBLE with the regular optimized python (There are ABI differences)
Jan 29 14:42:51 so for example, you couldn't run yum with debug python
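Since debug and regular builds are not ABI compatible, it can help to check which one is running. A minimal sketch, assuming CPython: `sys.gettotalrefcount` only exists in `--with-pydebug` builds, and the helper name `is_debug_build` is made up for this example.

```python
import sys
import sysconfig


def is_debug_build():
    """Return True when running on a --with-pydebug build of CPython."""
    # Debug builds expose extra instrumentation such as gettotalrefcount().
    return hasattr(sys, "gettotalrefcount")


if __name__ == "__main__":
    print("debug build:", is_debug_build())
    # Build-time flag; may be None on platforms that do not record it.
    print("Py_DEBUG:", sysconfig.get_config_var("Py_DEBUG"))
    # On POSIX builds the ABI flags contain "d" for debug builds.
    print("abiflags:", getattr(sys, "abiflags", ""))
```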
Jan 29 14:42:57 -- Python 3
Jan 29 14:43:13 Python 3 is a big rewrite of CPython 2, fixing lots of long-standing problems
Jan 29 14:43:26 there are syntactic differences from Python 2
Jan 29 14:43:44 changes in the standard library, different .pyc files
Jan 29 14:44:04 but it should be much nicer to use than Python 2 (in the speaker's opinion)
Jan 29 14:44:17 there is slowly growing 3rd party module support
Jan 29 14:44:34 * question - Is the global lock gone in python 3?
Jan 29 14:44:45 no, it is there still
Jan 29 14:44:58 there are arch. issues that won't be fixed in Python 3
Jan 29 14:45:05 -> alternate Python
Jan 29 14:45:08 Jython:
Jan 29 14:45:24 Java base class: org.python.core.PyObject
Jan 29 14:45:38 Can wrap arbitrary objects in python
Jan 29 14:45:45 so what is the runtime of Jython?
Jan 29 14:45:50 it's java
Jan 29 14:46:05 the .py files are compiled to syntax trees, then converted directly into java bytecode
Jan 29 14:46:26 in theory, it should be fast (JIT-compiled machine code)
Jan 29 14:46:53 however, much of the time, Java bytecode is calling back into the core PyObject code which has to implement some messy switch statements
Jan 29 14:47:08 * q - doesn't java do a lot of the same things?
Jan 29 14:47:36 kind of, but java is a lot more static and there is some hacking in order to bridge the two worlds
Jan 29 14:47:45 so what are the advantages of jython?
Jan 29 14:47:57 you can embed it inside a java appserver
Jan 29 14:48:17 you can use the java garbage collector (the CPython GC is not very good)
Jan 29 14:48:33 the java VM is perhaps the best open source runtime that we have:
Jan 29 14:48:52 JIT, GC, years of research and competition, no GIL
Jan 29 14:49:21 * missed question
Jan 29 14:49:58 the GIL tends to be an issue when you're trying to get max performance
Jan 29 14:50:13 for example, some of yum's perf issues come from talking to disk
Jan 29 14:50:23 you want it to have a simple interface, though
Jan 29 14:50:36 but when you are working on a script, how fast does it really have to be?
Jan 29 14:50:50 the performance issues are more prevalent in webserver space
Jan 29 14:50:59 GIL hampers multithreading (threads can't run python code on multiple cores at once)
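One common way around the GIL for CPU-bound work is the standard `multiprocessing` module, which uses separate processes, each with its own interpreter and its own lock. A minimal sketch follows; the prime-counting workload and the function names are arbitrary choices for this example.

```python
from multiprocessing import Pool


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def parallel_counts(limits, workers=2):
    """Run count_primes() for each limit in a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # when they import the main module.
    print(parallel_counts([20_000, 20_000, 20_000, 20_000]))
```

The trade-off is that arguments and results are pickled and copied between processes, so this pays off only when the work per task outweighs that overhead.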
Jan 29 14:51:09 -> back to Jython
Jan 29 14:51:26 you can also use Java DB bindings in python code
Jan 29 14:51:31 but Jython is still at 2.6
Jan 29 14:51:39 ==> IronPython
Jan 29 14:51:53 IP is similar to Jython but on top of the CLR (.NET Runtime)
Jan 29 14:52:06 question was about using the multiprocessing module instead of threads. It breaks the work out into subprocesses instead of threads so multiple cores can be taken advantage of.
Jan 29 14:52:14 apparently it works on Mono (not sure who is working on this)
Jan 29 14:52:48 *q didn't microsoft fire the person who was working on IP (not sure if I'm right on this - transcriber)
Jan 29 14:52:55 ... moving on to PyPy
Jan 29 14:53:10 PyPy is very different from the others talked about
Jan 29 14:53:28 it is an implementation of an interpreter for the FULL python language (with JIT compilation)
Jan 29 14:53:37 it is written in a high level language
Jan 29 14:54:02 the implementation language is compiled down to .c code from which we get binary
Jan 29 14:54:11 can also compile to C#, Java etc.
Jan 29 14:54:29 PyPy is actually written in Python (hence PyPy)
Jan 29 14:54:46 * Diagram on slides about how many pythons
Jan 29 14:55:21 you are supposed to run python code through the PyPy code, which spits out lots of generated c code
Jan 29 14:55:36 which should have the same behavior as the python code would have running through the interpreter
Jan 29 14:56:03 so you end up with c code that takes a while to translate but it can end up much faster than interpreted python code
Jan 29 14:56:13 PyPy has limitations
Jan 29 14:56:32 so we go through this strange process of translation to allow different optimization
Jan 29 14:56:55 pypy does have .pyc files by default (similar but different from CPython)
Jan 29 14:57:09 it is starting to have support for the CPython extension APIs
Jan 29 14:57:27 it is different, but every time the speaker has tried them, it has segfaulted, crashed and burned
Jan 29 14:57:39 bcl: sorry, I missed your question, will try after talk
Jan 29 14:57:47 advantages of pypy
Jan 29 14:57:54 speed: see http://speed.pypy.org
Jan 29 14:58:10 it is fast because the object implementations are better and have smarter data structures
Jan 29 14:58:26 JIT: based on tracing itself, interpreting "hot" loops
Jan 29 14:59:02 tflink: I'm sitting behind you :) comment was about the multiprocessing question you missed.
Jan 29 14:59:17 bcl: thanks, just trying to keep up
Jan 29 14:59:45 memory usage should be better, because of smarter data structures
Jan 29 14:59:54 disadvantages of pypy:
Jan 29 15:00:06 currently at python 2.5 (2.7 is on its way)
Jan 29 15:00:16 6 million lines of autogenerated .c code
Jan 29 15:00:25 they also only seem to care about the 2 archs
Jan 29 15:00:48 also wanted to go into packaging and other implementation
Jan 29 15:01:00 but he will probably leave that discussion for the mailing list
Jan 29 15:01:10 * out of time
Jan 29 15:01:25 ==> This is the end of the python presentation