The fifth article demonstrates how to write, compile, and execute your first program in four different compiled programming languages (C, C++, Fortran, and Go).

The sixth and final article conducts some performance benchmarks of the CPU, Memory, Disk, and Network, in both native Ubuntu on a physical machine, and Ubuntu on Windows running on the same system.

I really enjoyed writing these. Hopefully you'll try some of the examples and share your experiences using Ubuntu native utilities on a Windows desktop. You can find the source code of the programming examples on GitHub and Launchpad:

b) Install the Ubuntu monospace font by opening the zip file you downloaded, finding UbuntuMono-R.ttf, double-clicking on it, and then clicking Install.

c) Enable the Ubuntu monospace font for the command console in the Windows registry. Open regedit, find the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console\TrueTypeFont, and add a new string value named "000" with value data "Ubuntu Mono".
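The same registry tweak can be captured in a .reg file that you import by double-clicking it (a sketch of the key and value named in the step above; the filename is up to you):

```reg
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console\TrueTypeFont]
"000"="Ubuntu Mono"
```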

Recently a very interesting bug was reported against Ubuntu One on Windows. Apparently we try to sync a number of system folders that are present on Windows 7 for backward compatibility.

The problem

The actual problem in the code is that we are using os.listdir. While os.listdir in Python does return system folders (at the end of the day, they are there), os.walk does not. For example, let's imagine that we have the following scenario:
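To see the two enumerations side by side, here is a small platform-neutral sketch (on a plain directory they agree; the divergence described in the bug report involves the hidden Windows 7 compatibility junctions, which cannot be reproduced off Windows):

```python
import os
import tempfile

def walk_top_level(root):
    """Return the entries os.walk reports directly under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        # the first tuple yielded by os.walk describes root itself
        return set(dirnames) | set(filenames)
    return set()

root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "Documents"))
open(os.path.join(root, "notes.txt"), "w").close()

# on a plain directory both enumerations agree; the Windows 7 junctions
# ("Application Data" and friends) are where listdir and walk diverged
print(sorted(os.listdir(root)))
print(sorted(walk_top_level(root)))
```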

Two really good developers, Alecu and Diego, have discovered a very interesting bug in the os.path.expanduser function in Python. If you have a user on your Windows machine with a name that uses Japanese characters like “??????”, you will have the following in your system:

The Windows Shell will show the path correctly, that is: “C:\Users\??????”

cmd.exe will show: “C:\Users\??????”

All the env variables will be wrong, which means they will contain the mangled text shown in cmd.exe.

The above is clearly a problem, especially when the implementation of os.path.expanduser on Windows is:

The above code ensures that we only use SHGetFolderPathW when SHGetKnownFolderPathW is not available in the system. The reasoning is that SHGetFolderPathW is deprecated and new applications are encouraged to use SHGetKnownFolderPathW.

A much better solution is to patch ntpath.py so that it looks something like what I propose for Ubuntu One. Does anyone know if this is fixed in Python 3? Shall I propose a fix?
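As a sketch of the idea behind such a patch: ask the shell for the profile folder through the wide-string API instead of trusting the possibly mojibake USERPROFILE variable. SHGetFolderPathW and CSIDL_PROFILE are real Win32 names; the wrapper and its non-Windows fallback are my illustration, not the actual Ubuntu One patch:

```python
import os
import sys

CSIDL_PROFILE = 40  # the current user's profile directory

def expanduser_unicode(path):
    """Unicode-safe expanduser sketch: on Windows, ask the shell for the
    profile folder via SHGetFolderPathW, whose result is a proper wide
    string even when the user name is non-ASCII."""
    if not path.startswith("~"):
        return path
    if sys.platform == "win32":
        import ctypes
        buf = ctypes.create_unicode_buffer(1024)
        ctypes.windll.shell32.SHGetFolderPathW(None, CSIDL_PROFILE,
                                               None, 0, buf)
        home = buf.value
    else:
        # illustrative fallback so the sketch runs anywhere
        home = os.path.expanduser("~")
    return home + path[1:]

print(expanduser_unicode("~/Documents"))
```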

On Ubuntu One we use BitRock to create the packages for Windows. One of the new features I’m working on is to check periodically whether there are updates, so that the user gets notified. On Ubuntu this code is not needed because the Update Manager takes care of that, but when you work in an inferior OS…

Generate the auto-update.exe

In order to check for updates we use the auto-update.exe wizard generated by BitRock. Generating the wizard is very straightforward: first, as with most of the BitRock tooling, we write the XML that configures the generated .exe.

There is just a single thing worth mentioning about the above XML: requireInstallationByRootUser. It matters because we use the generated .exe only to check whether updates are present, and we do not want the user to need admin rights for that; it would not make sense. Once you have the above or similar XML you can execute:

One of the features that I really like in Ubuntu One is the ability to have read-only shares, which allow me to share files with some of my friends without giving them the chance to change my files. In order to support that in a more explicit way on Windows, we needed to be able to change the ACEs of a file's ACL to stop the user from changing the files. In reality there is no need to change the ACEs, since the server will ensure that the files are not changed, but as with Python, it is better to be explicit than implicit.

Our solution has the following details:

The file system is not using FAT.

We assume that the average user does not usually change the ACEs of a file.

If the user does change the ACEs, they do not add any deny ACE.

We want to keep the already present ACEs.

The idea is very simple: we add an ACE for the path that removes the user's write rights, so that they cannot edit/rename/delete a file and can only list the directories. The full code is the following:

USER_SID = LookupAccountName("", GetUserName())[0]

def _add_deny_ace(path, rights):
    """Remove rights from a path for the given groups."""
    if not os.path.exists(path):
        raise WindowsError('Path %s could not be found.' % path)
    if rights is not None:
        security_descriptor = GetFileSecurity(path, DACL_SECURITY_INFORMATION)
        dacl = security_descriptor.GetSecurityDescriptorDacl()
        # set the attributes of the group only if not null
        dacl.AddAccessDeniedAceEx(ACL_REVISION_DS,
            CONTAINER_INHERIT_ACE | OBJECT_INHERIT_ACE, rights,
            USER_SID)
        security_descriptor.SetSecurityDescriptorDacl(1, dacl, 0)
        SetFileSecurity(path, DACL_SECURITY_INFORMATION, security_descriptor)

def _remove_deny_ace(path):
    """Remove the deny ace for the given groups."""
    if not os.path.exists(path):
        raise WindowsError('Path %s could not be found.' % path)
    security_descriptor = GetFileSecurity(path, DACL_SECURITY_INFORMATION)
    dacl = security_descriptor.GetSecurityDescriptorDacl()
    # if we delete an ace in the acl the index is outdated and we have
    # to ensure that we do not screw it up. We keep the number of deleted
    # items to update the index accordingly.
    num_delete = 0
    for index in range(0, dacl.GetAceCount()):
        ace = dacl.GetAce(index - num_delete)
        # check if the ace is for the user and its type is 1, that means
        # it is a deny ace and we added it, lets remove it
        if USER_SID == ace[2] and ace[0][0] == 1:
            dacl.DeleteAce(index - num_delete)
            num_delete += 1
    security_descriptor.SetSecurityDescriptorDacl(1, dacl, 0)
    SetFileSecurity(path, DACL_SECURITY_INFORMATION, security_descriptor)

def set_no_rights(path):
    """Set the rights for 'path' to be none.

    Set the groups to be empty which will remove all the rights of the file.
    """
    os.chmod(path, 0o000)
    rights = FILE_ALL_ACCESS
    _add_deny_ace(path, rights)

def set_file_readonly(path):
    """Change path permissions to readonly in a file."""
    # we use the win32 api because chmod just sets the readonly flag and
    # we want to have more control over the permissions
    rights = FILE_WRITE_DATA | FILE_APPEND_DATA | FILE_GENERIC_WRITE
    # the above equals more or less 0444
    _add_deny_ace(path, rights)

def set_file_readwrite(path):
    """Change path permissions to readwrite in a file."""
    # the above equals more or less 0774
    _remove_deny_ace(path)
    os.chmod(path, stat.S_IWRITE)

def set_dir_readonly(path):
    """Change path permissions to readonly in a dir."""
    rights = FILE_WRITE_DATA | FILE_APPEND_DATA
    # the above equals more or less 0444
    _add_deny_ace(path, rights)

def set_dir_readwrite(path):
    """Change path permissions to readwrite in a dir.

    Helper that receives a windows path.
    """
    # the above equals more or less 0774
    _remove_deny_ace(path)
    # remove the read only flag
    os.chmod(path, stat.S_IWRITE)

Adding the Deny ACE

The idea of the code is very simple, we will add a Deny ACE to the path so that the user cannot write it. The Deny ACE is different if it is a file or a directory since we want the user to be able to list the contents of a directory.


def _add_deny_ace(path, rights):
    """Remove rights from a path for the given groups."""
    if not os.path.exists(path):
        raise WindowsError('Path %s could not be found.' % path)
    if rights is not None:
        security_descriptor = GetFileSecurity(path, DACL_SECURITY_INFORMATION)
        dacl = security_descriptor.GetSecurityDescriptorDacl()
        # set the attributes of the group only if not null
        dacl.AddAccessDeniedAceEx(ACL_REVISION_DS,
            CONTAINER_INHERIT_ACE | OBJECT_INHERIT_ACE, rights,
            USER_SID)
        security_descriptor.SetSecurityDescriptorDacl(1, dacl, 0)
        SetFileSecurity(path, DACL_SECURITY_INFORMATION, security_descriptor)

Remove the Deny ACE

Very similar to the above but doing the opposite: remove the Deny ACEs present for the current user. Notice that we store how many ACEs we have removed; the reason is simple: once we delete an ACE the index is no longer valid, so we calculate the correct one by subtracting the number already removed.


def _remove_deny_ace(path):
    """Remove the deny ace for the given groups."""
    if not os.path.exists(path):
        raise WindowsError('Path %s could not be found.' % path)
    security_descriptor = GetFileSecurity(path, DACL_SECURITY_INFORMATION)
    dacl = security_descriptor.GetSecurityDescriptorDacl()
    # if we delete an ace in the acl the index is outdated and we have
    # to ensure that we do not screw it up. We keep the number of deleted
    # items to update the index accordingly.
    num_delete = 0
    for index in range(0, dacl.GetAceCount()):
        ace = dacl.GetAce(index - num_delete)
        # check if the ace is for the user and its type is 1, that means
        # it is a deny ace and we added it, lets remove it
        if USER_SID == ace[2] and ace[0][0] == 1:
            dacl.DeleteAce(index - num_delete)
            num_delete += 1
    security_descriptor.SetSecurityDescriptorDacl(1, dacl, 0)
    SetFileSecurity(path, DACL_SECURITY_INFORMATION, security_descriptor)

Implement access

Our access implementation takes into account the Deny ACE added to ensure that we do not only look at the flags.

def access(path):
    """Return if the path is at least readable."""
    # lets consider the access on an illegal path to be a special case
    # since that will only occur in the case where the user created the path
    # for a file to be readable it has to be readable either by the user or
    # by the everyone group
    # XXX: ENOPARSE ^ (nessita)
    if not os.path.exists(path):
        return False
    security_descriptor = GetFileSecurity(path, DACL_SECURITY_INFORMATION)
    dacl = security_descriptor.GetSecurityDescriptorDacl()
    for index in range(0, dacl.GetAceCount()):
        ace = dacl.GetAce(index)
        # a deny ace (type 1) that we added for the current user
        if USER_SID == ace[2] and ace[0][0] == 1:
            # check which access is denied
            if ace[1] | FILE_GENERIC_READ == ace[1] or \
                    ace[1] | FILE_ALL_ACCESS == ace[1]:
                return False
    return True

Implement can_write

The following code is similar to access but checks if we have a readonly file.

def can_write(path):
    """Return if the path is writable."""
    # lets consider the access on an illegal path to be a special case
    # since that will only occur in the case where the user created the path
    # XXX: ENOPARSE ^ (nessita)
    if not os.path.exists(path):
        return False
    security_descriptor = GetFileSecurity(path, DACL_SECURITY_INFORMATION)
    dacl = security_descriptor.GetSecurityDescriptorDacl()
    for index in range(0, dacl.GetAceCount()):
        ace = dacl.GetAce(index)
        # a deny ace (type 1) that we added for the current user
        if USER_SID == ace[2] and ace[0][0] == 1:
            # check which access is denied
            if ace[1] | FILE_GENERIC_WRITE == ace[1] or \
                    ace[1] | FILE_WRITE_DATA == ace[1] or \
                    ace[1] | FILE_APPEND_DATA == ace[1] or \
                    ace[1] | FILE_ALL_ACCESS == ace[1]:
                return False
    return True

Last week was probably one of the best coding sprints I have had since I started working at Canonical, I’m serious! I had the luck to pair program with alecu on the FilesystemMonitor that we use in Ubuntu One on Windows. The implementation has improved so much that I wanted to blog about it and show it as an example of how to hook the ReadDirectoryChangesW call from COM into twisted, so that you can process the events using twisted, which is bloody cool.

We have reduced the implementation of the Watch and WatchManager to match our needs and trimmed the provided API, since we do not use everything pyinotify offers. The Watch implementation is as follows:

class Watch(object):
    """Implement the same functions as pyinotify.Watch."""

    def __init__(self, watch_descriptor, path, mask, auto_add, processor,
                 buf_size=8192):
        super(Watch, self).__init__()
        self.log = logging.getLogger(
            'ubuntuone.SyncDaemon.platform.windows.'
            'filesystem_notifications.Watch')
        self.log.setLevel(TRACE)
        self._processor = processor
        self._buf_size = buf_size
        self._wait_stop = CreateEvent(None, 0, 0, None)
        self._overlapped = OVERLAPPED()
        self._overlapped.hEvent = CreateEvent(None, 0, 0, None)
        self._watching = False
        self._descriptor = watch_descriptor
        self._auto_add = auto_add
        self._ignore_paths = []
        self._cookie = None
        self._source_pathname = None
        self._process_thread = None
        # remember the subdirs we have so that when we have a delete we can
        # check if it was a remove
        self._subdirs = []
        # ensure that we work with an abspath and that we can deal with
        # long paths over 260 chars.
        if not path.endswith(os.path.sep):
            path += os.path.sep
        self._path = os.path.abspath(path)
        self._mask = mask
        # this deferred is fired when the watch has started monitoring
        # a directory from a thread
        self._watch_started_deferred = defer.Deferred()

    @is_valid_windows_path(path_indexes=[1])
    def _path_is_dir(self, path):
        """Check if the path is a dir and update the local subdir list."""
        self.log.debug('Testing if path %r is a dir', path)
        is_dir = False
        if os.path.exists(path):
            is_dir = os.path.isdir(path)
        else:
            self.log.debug('Path "%s" was deleted subdirs are %s.',
                           path, self._subdirs)
            # we removed the path, we look in the internal list
            if path in self._subdirs:
                is_dir = True
                self._subdirs.remove(path)
        if is_dir:
            self.log.debug('Adding %s to subdirs %s', path, self._subdirs)
            self._subdirs.append(path)
        return is_dir

    def _process_events(self, events):
        """Process the events from the queue."""
        # do not do it if we stopped watching and the events are empty
        if not self._watching:
            return
        # we transform the events to be the same as the ones in pyinotify
        # and then use the proc_fun
        for action, file_name in events:
            if any([file_name.startswith(path)
                    for path in self._ignore_paths]):
                continue
            # map the windows events to the pyinotify ones, this is dirty
            # but makes the multiplatform better, linux was first :P
            syncdaemon_path = get_syncdaemon_valid_path(
                os.path.join(self._path, file_name))
            is_dir = self._path_is_dir(os.path.join(self._path, file_name))
            if is_dir:
                self._subdirs.append(file_name)
            mask = WINDOWS_ACTIONS[action]
            head, tail = os.path.split(file_name)
            if is_dir:
                mask |= IN_ISDIR
            event_raw_data = {'wd': self._descriptor,
                              'dir': is_dir,
                              'mask': mask,
                              'name': tail,
                              'path': '.'}
            # by the way in which the win api fires the events we know for
            # sure that no move events will be added in the wrong order,
            # this is kind of hacky, I dont like it too much
            if WINDOWS_ACTIONS[action] == IN_MOVED_FROM:
                self._cookie = str(uuid4())
                self._source_pathname = tail
                event_raw_data['cookie'] = self._cookie
            if WINDOWS_ACTIONS[action] == IN_MOVED_TO:
                event_raw_data['src_pathname'] = self._source_pathname
                event_raw_data['cookie'] = self._cookie
            event = Event(event_raw_data)
            # FIXME: event deduces the pathname wrong and we need to
            # manually set it
            event.pathname = syncdaemon_path
            # add the event only if we do not have an exclude filter or
            # the exclude filter returns False, that is, the event will
            # not be excluded
            self.log.debug('Event is %s.', event)
            self._processor(event)

    def _call_deferred(self, f, *args):
        """Execute the deferred call avoiding possible race conditions."""
        if not self._watch_started_deferred.called:
            f(*args)

    def _watch(self):
        """Watch a path that is a directory."""
        # we are going to be using ReadDirectoryChangesW which requires
        # a directory handle and the mask to be used.
        handle = CreateFile(self._path,
                            FILE_LIST_DIRECTORY,
                            FILE_SHARE_READ | FILE_SHARE_WRITE,
                            None,
                            OPEN_EXISTING,
                            FILE_FLAG_BACKUP_SEMANTICS |
                            FILE_FLAG_OVERLAPPED,
                            None)
        self.log.debug('Watching path %s.', self._path)
        while True:
            # important information to know about the parameters:
            # param 1: the handle to the dir
            # param 2: the size to be used in the kernel to store events
            # that might be lost while the call is being performed. This
            # is complicated to fine tune since if you make lots of
            # watchers you might use too much memory and make your OS BSOD
            buf = AllocateReadBuffer(self._buf_size)
            try:
                ReadDirectoryChangesW(
                    handle,
                    buf,
                    self._auto_add,
                    self._mask,
                    self._overlapped,
                )
                reactor.callFromThread(
                    self._call_deferred,
                    self._watch_started_deferred.callback, True)
            except error:
                # the handle is invalid, this may occur if we decided to
                # stop watching before we go in the loop, lets get out
                reactor.callFromThread(
                    self._call_deferred,
                    self._watch_started_deferred.errback, error)
                break
            # wait for an event and ensure that we either stop or read
            # the data
            rc = WaitForMultipleObjects((self._wait_stop,
                                         self._overlapped.hEvent),
                                        0, INFINITE)
            if rc == WAIT_OBJECT_0:
                # Stop event
                break
            # if we continue, it means that we got some data, lets read it
            data = GetOverlappedResult(handle, self._overlapped, True)
            # lets read the data and store it in the results
            events = FILE_NOTIFY_INFORMATION(buf, data)
            self.log.debug('Events from ReadDirectoryChangesW are %s',
                           events)
            reactor.callFromThread(self._process_events, events)
        CloseHandle(handle)

    @is_valid_windows_path(path_indexes=[1])
    def ignore_path(self, path):
        """Add the path of the events to ignore."""
        if not path.endswith(os.path.sep):
            path += os.path.sep
        if path.startswith(self._path):
            path = path[len(self._path):]
        self._ignore_paths.append(path)

    @is_valid_windows_path(path_indexes=[1])
    def remove_ignored_path(self, path):
        """Reaccept path."""
        if not path.endswith(os.path.sep):
            path += os.path.sep
        if path.startswith(self._path):
            path = path[len(self._path):]
        if path in self._ignore_paths:
            self._ignore_paths.remove(path)

    def start_watching(self):
        """Tell the watch to start processing events."""
        for current_child in os.listdir(self._path):
            full_child_path = os.path.join(self._path, current_child)
            if os.path.isdir(full_child_path):
                self._subdirs.append(full_child_path)
        # start two threads: one to watch the path, the other to
        # process the events.
        self.log.debug('Start watching path.')
        self._watching = True
        reactor.callInThread(self._watch)
        return self._watch_started_deferred

    def stop_watching(self):
        """Tell the watch to stop processing events."""
        self.log.info('Stop watching %s', self._path)
        SetEvent(self._wait_stop)
        self._watching = False
        self._subdirs = []

    def update(self, mask, auto_add=False):
        """Update the info used by the watcher."""
        self.log.debug('update(%s, %s)', mask, auto_add)
        self._mask = mask
        self._auto_add = auto_add

    @property
    def path(self):
        """Return the path watched."""
        return self._path

    @property
    def auto_add(self):
        return self._auto_add

The important details of this implementation are the following:

Use a deferred to notify that the watch started.

During our tests we noticed that the start watch function was slow, which would mean that between the point when we start watching the directory and the point when the thread actually started we would be losing events. The function now returns a deferred that is fired when ReadDirectoryChangesW has been called, which ensures that no events are lost. The interesting parts are the following:

define the deferred


# this deferred is fired when the watch has started monitoring
# a directory from a thread
self._watch_started_deferred = defer.Deferred()

except error:
    # the handle is invalid, this may occur if we decided to
    # stop watching before we go in the loop, lets get out of it
    reactor.callFromThread(self._call_deferred,
                           self._watch_started_deferred.errback, error)
    break

Threading and firing the reactor.

There is an interesting detail to take care of in this code. We have to ensure that the deferred is not fired more than once; to do that, you callFromThread a function that fires the deferred only when it has not already been fired, like this:
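A self-contained sketch of that guard, with a stub standing in for the real twisted Deferred (the stub and helper names are mine):

```python
class StubDeferred(object):
    """Tiny stand-in for a twisted Deferred, just enough for the sketch."""
    def __init__(self):
        self.called = False
        self.results = []

    def callback(self, result):
        self.called = True
        self.results.append(result)

def _call_deferred(deferred, f, *args):
    """Fire the deferred only if it has not already been fired."""
    if not deferred.called:
        f(*args)

d = StubDeferred()
# two racing attempts to fire: only the first one goes through
_call_deferred(d, d.callback, True)
_call_deferred(d, d.callback, True)
print(d.results)  # [True]
```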

If you do not do the above, but instead the code below, you will have a race condition in which the deferred is called more than once.

buf = AllocateReadBuffer(self._buf_size)
try:
    ReadDirectoryChangesW(
        handle,
        buf,
        self._auto_add,
        self._mask,
        self._overlapped,
    )
    if not self._watch_started_deferred.called:
        reactor.callFromThread(
            self._watch_started_deferred.callback, True)
except error:
    # the handle is invalid, this may occur if we decided to
    # stop watching before we go in the loop, lets get out of it
    if not self._watch_started_deferred.called:
        reactor.callFromThread(
            self._watch_started_deferred.errback, error)
    break

Execute the processing of events in the reactor main thread.

Alecu has bloody great ideas way too often, and this is one of his. The processing of the events is queued to be executed in the twisted reactor main thread which reduces the amount of threads we use and will ensure that the events are processed in the correct order.


# if we continue, it means that we got some data, lets read it
data = GetOverlappedResult(handle, self._overlapped, True)
# lets read the data and store it in the results
events = FILE_NOTIFY_INFORMATION(buf, data)
self.log.debug('Events from ReadDirectoryChangesW are %s', events)
reactor.callFromThread(self._process_events, events)

Pywin32 is a very cool project that allows you to access the Win32 API without having to go through ctypes and deal with all the crazy parameters that COM is famous for. Unfortunately it sometimes has issues which you face only a few times in your life.

In this case I found a bug where GetFileSecurity does not use the GetFileSecurityW function but the W-less version. For those who don’t have to deal with these terrible details, the W suffix usually means that the function knows how to deal with wide (Unicode) strings (backward compatibility can be a problem sometimes). I have reported the bug, but for those in a hurry, here is the patch:

At the moment some of the tests (and I cannot point out which ones) of ubuntuone-client fail when they are run on Windows. The reason for this is the way in which we get the notifications out of the file system and the way the tests are written. Before I blame the OS or the tests, let me explain a number of facts about the Windows filesystem and the possible ways to interact with it.

To be able to get file system changes from the OS the Win32 API provides the following:

This function was broken up until Vista, when it was fixed. Unfortunately, AFAIK we also support Windows XP, which means that we cannot trust this function. On top of that, taking this path means we can have a performance issue: because the function is built on top of Windows messages, if too many changes occur the sync daemon would start receiving roll-up messages that just state that something changed, and it would be up to the sync daemon to work out what really happened. Therefore we can all agree that this is a no-no, right?

This is a really easy function to use which is based on ReadDirectoryChangesW (I think it is a simple wrapper around it) that lets you know that something changed but gives no information about what changed. Because it is based on ReadDirectoryChangesW it suffers from the same issues.

This is by far the most common way to get file change notifications from the system. Now, in theory there are two possible cases which can go wrong and affect the events raised by this function:

There are too many events, the buffer gets overloaded, and we start losing events. A simple way to solve this issue is to process the events in a different thread ASAP so that we can keep reading the changes.
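The "drain in a different thread" idea can be sketched platform-independently: one thread pulls events off the source as fast as possible into a queue, while another consumes them at its own pace (the helper names and the sentinel protocol are mine):

```python
import threading
from queue import Queue

def watcher(event_source, out_queue):
    """Drain events from the source as fast as possible so the OS buffer
    never overflows; processing happens on another thread."""
    for event in event_source:
        out_queue.put(event)
    out_queue.put(None)  # sentinel: no more events

# a stand-in for the stream of (action, filename) changes
events = iter(['created a.txt', 'modified a.txt', 'deleted a.txt'])
q = Queue()
t = threading.Thread(target=watcher, args=(events, q))
t.start()

processed = []
while True:
    item = q.get()
    if item is None:
        break
    processed.append(item)
t.join()
print(processed)
```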

We use the sync version of the function which means that we could have the following issues:

Blue screen of death because we used too much memory from the kernel space.

We cannot close the handles used to watch the changes in the directories. This leaves the threads blocked.

As I mentioned, this is the theory, and it therefore makes perfect sense to choose this option as the way to get notified of changes until… you hit a great little feature of Windows called write-behind caching. The idea of write-behind caching is the following:

When you attempt to write a new file on your HD, Windows does not directly modify the HD. Instead it makes a note of the fact that your intention is to write on disk and saves your changes in memory. Isn’t that smart?

Well, that lovely feature comes enabled by default AFAIK from XP onwards. Any smart person would wonder how it interacts with FindFirstChangeNotification/ReadDirectoryChangesW; well, after some work here is what I have managed to find out:

The IO Manager (internal to the kernel) is queueing up disk-write requests in an internal buffer, and the actual changes are not physically committed until some condition is met, which I believe is the “write-behind caching” feature. The problem appears to be that the user-space callback via FileSystemWatcher/ReadDirectoryChanges does not occur when disk-write requests are inserted into the queue, but rather when they are leaving the queue and being physically committed to disk. From what I have been able to gather through observation, the lifetime of the queue is based on:

Whether more writes are being inserted in the queue.

Whether another app requests a read of an item in the queue.

This means that when using FileSystemWatcher/ReadDirectoryChanges the events are fired only when the changes are actually committed, and for a user-space program this follows a non-deterministic process (insert Spanish swearing here). A way to work around this issue is to use the FlushFileBuffers function on the volume, which does need admin rights, yeah!
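FlushFileBuffers on a volume handle is the Windows-only, admin-only workaround; the everyday per-file equivalent is flushing and fsyncing the handle. A cross-platform sketch of pushing data through the write-behind cache (the helper name is mine):

```python
import os
import tempfile

def write_durably(path, data):
    """Write data and force it through the OS write-behind cache.
    The volume-wide FlushFileBuffers trick from the post needs admin
    rights; fsync only needs the file handle."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's userspace buffer to the OS
        os.fsync(f.fileno())  # ask the OS to commit to the device

path = os.path.join(tempfile.mkdtemp(), "durable.bin")
write_durably(path, b"hello")
print(os.path.getsize(path))  # 5
```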

Well, this allows tracking the changes that have been committed on an NTFS system (which means we have no support for FAT). This technique keeps track of the changes using an update sequence number, in an interesting manner. At first look, although parsing the data is hard, this solution seems to be very similar to the one used by pyinotify, and therefore someone will say: hey, let’s just tell twisted to do a select on that file and read the changes. Well, no, it is not that easy: files do not provide the functionality used for select, just sockets (http://msdn.microsoft.com/en-us/library/aa363803%28VS.85%29.aspx) /me jumps with happiness

Well, this is an easy one to summarize: you have to write a driver-like piece of code. That means C, COM, and being able to crash the entire system with a nice blue screen (although I could change the color to aubergine before we crash).

Conclusion

At this point I hope I have convinced a few of you that ReadDirectoryChangesW is the best option to take, but you might be wondering why I mentioned the write-behind caching feature. Well, here comes my complaint about the tests. We do use the real file system notifications for testing, and the trial test cases do have a timeout! Those two facts plus the lovely write-behind caching feature mean that the tests on Windows fail just because the bloody events are not raised until they leave the queue of the IO manager.

During the past few days I have been trying to track down an issue in the Ubuntu One client tests which, when run on Windows, would use up all the threads that the Python process could have. As you can imagine, finding out why there are deadlocks is quite hard, especially when I thought the code was thread safe. Guess what? It wasn’t.

The bug I had in the code was related to the way in which ReadDirectoryChangesW works. This function can be executed in two different ways:

Synchronous

ReadDirectoryChangesW can be executed in a sync mode by NOT providing an OVERLAPPED structure to perform the IO operations, for example:
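The example the post refers to is not reproduced here; a hedged sketch of what a synchronous call looks like with pywin32 follows (the helper name and buffer size are mine, and the exact positional signature should be checked against the pywin32 docs):

```python
def watch_sync(dir_handle, notify_filter):
    """Synchronous sketch: without an OVERLAPPED structure the call blocks
    in the kernel until changes arrive, which is also why CloseHandle from
    another thread ends up blocked (hypothetical helper, Windows only)."""
    from win32file import ReadDirectoryChangesW
    # returns a list of (action, filename) tuples describing the changes;
    # 8192 is an arbitrary buffer size and True means "watch the subtree"
    return ReadDirectoryChangesW(dir_handle, 8192, True, notify_filter,
                                 None, None)

print(callable(watch_sync))
```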

If another thread attempts to close the handle while ReadDirectoryChangesW is waiting on it, the CloseHandle() method blocks (which has nothing to do with the GIL – it is correctly managed)

I got bitten in the ass by the second item, which broke my tests in two different ways: it left threads blocked, and a handle in use, so that the rest of the tests could not remove the tmp directories still in use by the blocked threads.

Asynchronous

In order to use the async version of the function we just have to provide an OVERLAPPED structure; this way the IO operations will not block, and we will also be able to close the handle from a different thread.

Using ReadDirectoryChangesW in this way solves all the issues found in the sync version, and the only extra overhead is that you need to understand how to deal with COM events, which is not that hard after you have worked with it for a little while.

I leave this here for people that might find the same issue and for me to remember how much my ass hurt.

One of the things we wanted to achieve for the Windows port of Ubuntu One was to deploy .exe files to users' systems rather than requiring them to have Python and all the different dependencies installed on their machines. There are different reasons we wanted to do this, but this post is not about that. The goal of this post is to explain what to do when you are using py2exe and you depend on a package such as lazr.restfulclient.

Why lazr.restfulclient?

There are different reasons why I’m using lazr.restfulclient as an example:

It is a dependency we have in Ubuntu One, and therefore I have already done the work with it.

It uses two features of setuptools that do not play well with py2exe:

It uses namespaced packages.

It uses pkg_resources to load resources used for the client.

Working around the use of namespaced packages

This is actually a fairly easy thing to solve and it is well documented in the py2exe wiki, nevertheless I’d like to show it in this post so that the inclusion of the lazr.restfulclient is complete.

The main issue with namespaced packages is that you have to tell the module finder from py2exe where to find those packages, which in our example are lazr.authentication, lazr.restfulclient and lazr.uri. A way to do that would be the following:
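The snippet is not reproduced above, but the pattern documented in the py2exe wiki looks roughly like this (the lazr loop is my reconstruction; guards are added so the sketch also runs where py2exe or lazr are absent):

```python
try:
    import py2exe.mf as modulefinder  # py2exe ships its own module finder
except ImportError:
    import modulefinder  # fall back to the stdlib finder

try:
    import lazr
    # register every directory that contributes to the 'lazr' namespace so
    # that lazr.authentication, lazr.restfulclient and lazr.uri are found
    for path in lazr.__path__:
        modulefinder.AddPackagePath('lazr', path)
except ImportError:
    pass  # lazr is not installed; nothing to register
```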

Adding the lazr resources

This is a more problematic issue to solve, since we have to work around a limitation in py2exe. lazr.restfulclient tries to load a resource from the py2exe library.zip, but as the zip file is reserved for compiled files, the lookup fails. In py2exe there is no way to state that those resource files have to be copied into library.zip, which means that an error is raised at runtime when trying to use the lib, but not at build time.

The best way (if not the only one) to solve this is to extend the py2exe command to copy the resource files to the folders that are zipped before they are embedded; that way pkg_resources will be able to load the files with no problems.

Before I introduce the code, let me say that this is not a 100% exact implementation of the interfaces found in pyinotify, but an implementation of a subset that matches my needs. The main idea of this post is to give an example of implementing such a library for Windows while reusing the code that can be found in pyinotify.

Once I have excused myself, let's get into the code. First of all, there are a number of classes from pyinotify that we can use in our code. That subset of classes is the code below, which I grabbed from the pyinotify git:

Unfortunately we need to implement the code that talks to the Win32 API to be able to retrieve the events in the file system. In my design this is done by the Watch class, which looks like this:

# Author: Manuel de la Pena <manuel@canonical.com>
#
# Copyright 2011 Canonical Ltd.
#
# This program is free software: you can redistribute it and/or modify it
# under the terms of the GNU General Public License version 3, as published
# by the Free Software Foundation.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranties of
# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
# PURPOSE.  See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program.  If not, see <http://www.gnu.org/licenses/>.
"""File notifications on windows."""

import logging
import os
import re

import winerror

from Queue import Queue, Empty
from threading import Thread
from uuid import uuid4

from twisted.internet import task, reactor
from win32con import (
    FILE_SHARE_READ,
    FILE_SHARE_WRITE,
    FILE_FLAG_BACKUP_SEMANTICS,
    FILE_NOTIFY_CHANGE_FILE_NAME,
    FILE_NOTIFY_CHANGE_DIR_NAME,
    FILE_NOTIFY_CHANGE_ATTRIBUTES,
    FILE_NOTIFY_CHANGE_SIZE,
    FILE_NOTIFY_CHANGE_LAST_WRITE,
    FILE_NOTIFY_CHANGE_SECURITY,
    OPEN_EXISTING)
from win32file import CreateFile, ReadDirectoryChangesW
from ubuntuone.platform.windows.pyinotify import (
    Event,
    WatchManagerError,
    ProcessEvent,
    PrintAllEvents,
    IN_OPEN,
    IN_CLOSE_NOWRITE,
    IN_CLOSE_WRITE,
    IN_CREATE,
    IN_ISDIR,
    IN_DELETE,
    IN_MOVED_FROM,
    IN_MOVED_TO,
    IN_MODIFY,
    IN_IGNORED)
from ubuntuone.syncdaemon.filesystem_notifications import (
    GeneralINotifyProcessor)
from ubuntuone.platform.windows.os_helper import (
    LONG_PATH_PREFIX,
    abspath,
    listdir)

# constants found in the msdn documentation:
# http://msdn.microsoft.com/en-us/library/ff538834(v=vs.85).aspx
FILE_LIST_DIRECTORY = 0x0001
FILE_NOTIFY_CHANGE_LAST_ACCESS = 0x00000020
FILE_NOTIFY_CHANGE_CREATION = 0x00000040

# a map between the few events that we have on windows and those
# found in pyinotify
WINDOWS_ACTIONS = {
    1: IN_CREATE,
    2: IN_DELETE,
    3: IN_MODIFY,
    4: IN_MOVED_FROM,
    5: IN_MOVED_TO}

# translates quickly the event and its is_dir state to our standard events
NAME_TRANSLATIONS = {
    IN_OPEN: 'FS_FILE_OPEN',
    IN_CLOSE_NOWRITE: 'FS_FILE_CLOSE_NOWRITE',
    IN_CLOSE_WRITE: 'FS_FILE_CLOSE_WRITE',
    IN_CREATE: 'FS_FILE_CREATE',
    IN_CREATE | IN_ISDIR: 'FS_DIR_CREATE',
    IN_DELETE: 'FS_FILE_DELETE',
    IN_DELETE | IN_ISDIR: 'FS_DIR_DELETE',
    IN_MOVED_FROM: 'FS_FILE_DELETE',
    IN_MOVED_FROM | IN_ISDIR: 'FS_DIR_DELETE',
    IN_MOVED_TO: 'FS_FILE_CREATE',
    IN_MOVED_TO | IN_ISDIR: 'FS_DIR_CREATE'}

# the default mask to be used in the watches added by the
# FilesystemMonitor class
FILESYSTEM_MONITOR_MASK = FILE_NOTIFY_CHANGE_FILE_NAME | \
    FILE_NOTIFY_CHANGE_DIR_NAME | \
    FILE_NOTIFY_CHANGE_ATTRIBUTES | \
    FILE_NOTIFY_CHANGE_SIZE | \
    FILE_NOTIFY_CHANGE_LAST_WRITE | \
    FILE_NOTIFY_CHANGE_SECURITY | \
    FILE_NOTIFY_CHANGE_LAST_ACCESS


# The implementation of the code that is provided as the pyinotify substitute
class Watch(object):
    """Implement the same functions as pyinotify.Watch."""

    def __init__(self, watch_descriptor, path, mask, auto_add,
                 events_queue=None, exclude_filter=None, proc_fun=None):
        super(Watch, self).__init__()
        self.log = logging.getLogger('ubuntuone.platform.windows.'
                                     'filesystem_notifications.Watch')
        self._watching = False
        self._descriptor = watch_descriptor
        self._auto_add = auto_add
        self.exclude_filter = None
        self._proc_fun = proc_fun
        self._cookie = None
        self._source_pathname = None
        # remember the subdirs we have so that when we have a delete we can
        # check if it was a remove
        self._subdirs = []
        # ensure that we work with an abspath and that we can deal with
        # long paths over 260 chars.
        self._path = os.path.abspath(path)
        if not self._path.startswith(LONG_PATH_PREFIX):
            self._path = LONG_PATH_PREFIX + self._path
        self._mask = mask
        # let's make the queue as big as possible
        self._raw_events_queue = Queue()
        if not events_queue:
            events_queue = Queue()
        self.events_queue = events_queue

    def _path_is_dir(self, path):
        """Check if the path is a dir and update the local subdir list."""
        self.log.debug('Testing if path "%s" is a dir', path)
        is_dir = False
        if os.path.exists(path):
            is_dir = os.path.isdir(path)
        else:
            self.log.debug('Path "%s" was deleted, subdirs are %s.',
                           path, self._subdirs)
            # the path was removed, so we look in the internal list
            if path in self._subdirs:
                is_dir = True
                self._subdirs.remove(path)
        if is_dir:
            self.log.debug('Adding %s to subdirs %s', path, self._subdirs)
            self._subdirs.append(path)
        return is_dir

    def _process_events(self):
        """Process the events from the queue."""
        # we transform the events to be the same as the ones in pyinotify
        # and then use the proc_fun
        while self._watching or not self._raw_events_queue.empty():
            file_name, action = self._raw_events_queue.get()
            # map the windows events to the pyinotify ones, this is dirty
            # but makes the multiplatform story better, linux was first :P
            is_dir = self._path_is_dir(file_name)
            mask = WINDOWS_ACTIONS[action]
            head, tail = os.path.split(file_name)
            if is_dir:
                mask |= IN_ISDIR
            event_raw_data = {
                'wd': self._descriptor,
                'dir': is_dir,
                'mask': mask,
                'name': tail,
                'path': head.replace(self.path, '.')}
            # by the way in which the win api fires the events we know for
            # sure that no move events will be added in the wrong order,
            # this is kind of hacky, I don't like it too much
            if WINDOWS_ACTIONS[action] == IN_MOVED_FROM:
                self._cookie = str(uuid4())
                self._source_pathname = tail
                event_raw_data['cookie'] = self._cookie
            if WINDOWS_ACTIONS[action] == IN_MOVED_TO:
                event_raw_data['src_pathname'] = self._source_pathname
                event_raw_data['cookie'] = self._cookie
            event = Event(event_raw_data)
            # FIXME: event deduces the pathname wrong and we need to set
            # it manually
            event.pathname = file_name
            # add the event only if we do not have an exclude filter or
            # the exclude filter returns False, that is, the event will
            # not be excluded
            if not self.exclude_filter or not self.exclude_filter(event):
                self.log.debug('Adding event %s to queue.', event)
                self.events_queue.put(event)

    def _watch(self):
        """Watch a path that is a directory."""
        # we are going to be using ReadDirectoryChangesW which requires
        # a directory handle and the mask to be used.
        handle = CreateFile(
            self._path,
            FILE_LIST_DIRECTORY,
            FILE_SHARE_READ | FILE_SHARE_WRITE,
            None,
            OPEN_EXISTING,
            FILE_FLAG_BACKUP_SEMANTICS,
            None)
        self.log.debug('Watching path %s.', self._path)
        while self._watching:
            # important information to know about the parameters:
            # param 1: the handle to the dir
            # param 2: the size to be used in the kernel to store events
            # that might be lost while the call is being performed. This
            # is complicated to fine tune since if you create lots of
            # watchers you might use too much memory and make your OS BSOD
            results = ReadDirectoryChangesW(
                handle,
                1024,
                self._auto_add,
                self._mask,
                None,
                None)
            # add the diff events to the queue so that they can be
            # processed no matter the speed.
            for action, file in results:
                full_filename = os.path.join(self._path, file)
                self._raw_events_queue.put((full_filename, action))
                self.log.debug('Added %s to raw events queue.',
                               (full_filename, action))

    def start_watching(self):
        """Tell the watch to start processing events."""
        # get the dirs that are children of the path
        for current_child in listdir(self._path):
            full_child_path = os.path.join(self._path, current_child)
            if os.path.isdir(full_child_path):
                self._subdirs.append(full_child_path)
        # start two different threads, one to watch the path, the other
        # to process the events.
        self.log.debug('Start watching path.')
        self._watching = True
        watch_thread = Thread(target=self._watch,
                              name='Watch(%s)' % self._path)
        process_thread = Thread(target=self._process_events,
                                name='Process(%s)' % self._path)
        process_thread.start()
        watch_thread.start()

    def stop_watching(self):
        """Tell the watch to stop processing events."""
        self._watching = False
        self._subdirs = []

    def update(self, mask, proc_fun=None, auto_add=False):
        """Update the info used by the watcher."""
        self.log.debug('update(%s, %s, %s)', mask, proc_fun, auto_add)
        self._mask = mask
        self._proc_fun = proc_fun
        self._auto_add = auto_add

    @property
    def path(self):
        """Return the path being watched."""
        return self._path

    @property
    def auto_add(self):
        return self._auto_add

    @property
    def proc_fun(self):
        return self._proc_fun


class WatchManager(object):
    """Implement the same functions as pyinotify.WatchManager."""

    def __init__(self, exclude_filter=lambda path: False):
        """Init the manager to keep track of the different watches."""
        super(WatchManager, self).__init__()
        self.log = logging.getLogger('ubuntuone.platform.windows.'
                                     'filesystem_notifications.WatchManager')
        self._wdm = {}
        self._wd_count = 0
        self._exclude_filter = exclude_filter
        self._events_queue = Queue()
        self._ignored_paths = []

    def stop(self):
        """Close the manager and stop all watches."""
        self.log.debug('Stopping watches.')
        for current_wd in self._wdm:
            self._wdm[current_wd].stop_watching()
            self.log.debug('Watch for %s stopped.',
                           self._wdm[current_wd].path)

    def get_watch(self, wd):
        """Return the watch with the given descriptor."""
        return self._wdm[wd]

    def del_watch(self, wd):
        """Delete the watch with the given descriptor."""
        try:
            watch = self._wdm[wd]
            watch.stop_watching()
            del self._wdm[wd]
            self.log.debug('Watch %s removed.', wd)
        except KeyError, e:
            logging.error(str(e))

    def _add_single_watch(self, path, mask, proc_fun=None, auto_add=False,
                          quiet=True, exclude_filter=None):
        self.log.debug('add_single_watch(%s, %s, %s, %s, %s, %s)', path,
                       mask, proc_fun, auto_add, quiet, exclude_filter)
        self._wdm[self._wd_count] = Watch(
            self._wd_count, path, mask, auto_add,
            events_queue=self._events_queue,
            exclude_filter=exclude_filter, proc_fun=proc_fun)
        self._wdm[self._wd_count].start_watching()
        self._wd_count += 1
        self.log.debug('Watch count increased to %s', self._wd_count)

    def add_watch(self, path, mask, proc_fun=None, auto_add=False,
                  quiet=True, exclude_filter=None):
        if hasattr(path, '__iter__'):
            self.log.debug('Added collection of watches.')
            # we are dealing with a collection of paths
            for current_path in path:
                if not self.get_wd(current_path):
                    self._add_single_watch(current_path, mask, proc_fun,
                                           auto_add, quiet, exclude_filter)
        elif not self.get_wd(path):
            self.log.debug('Adding single watch.')
            self._add_single_watch(path, mask, proc_fun, auto_add,
                                   quiet, exclude_filter)

    def update_watch(self, wd, mask=None, proc_fun=None, rec=False,
                     auto_add=False, quiet=True):
        try:
            watch = self._wdm[wd]
            watch.stop_watching()
            self.log.debug('Stopped watch on %s for update.', watch.path)
            # update the data and restart watching
            auto_add = auto_add or rec
            watch.update(mask, proc_fun=proc_fun, auto_add=auto_add)
            # only start the watcher again if the mask was given, otherwise
            # we are not watching and therefore do not care
            if mask:
                watch.start_watching()
        except KeyError, e:
            self.log.error(str(e))
            if not quiet:
                raise WatchManagerError('Watch %s was not found' % wd, {})

    def get_wd(self, path):
        """Return the watcher that is used to watch the given path."""
        for current_wd in self._wdm:
            if self._wdm[current_wd].path in path and \
                    self._wdm[current_wd].auto_add:
                return current_wd

    def get_path(self, wd):
        """Return the path watched by the watch with the given wd."""
        watch = self._wdm.get(wd)
        if watch:
            return watch.path

    def rm_watch(self, wd, rec=False, quiet=True):
        """Remove the watch with the given wd."""
        try:
            watch = self._wdm[wd]
            watch.stop_watching()
            del self._wdm[wd]
        except KeyError, err:
            self.log.error(str(err))
            if not quiet:
                raise WatchManagerError('Watch %s was not found' % wd, {})

    def rm_path(self, path):
        """Remove a watch for the given path."""
        # it would be very tricky to remove a subpath from a watcher that
        # is looking at changes in its children. To make it simpler and
        # less error prone (and even more performant since we use fewer
        # threads) we will add a filter to the events in the watcher so
        # that the events from that child are not received :)
        def ignore_path(event):
            """Ignore an event if it has a given path."""
            for ignored_path in self._ignored_paths:
                if ignored_path in event.pathname:
                    return True
            return False

        wd = self.get_wd(path)
        if wd:
            if self._wdm[wd].path == path:
                self.log.debug('Removing watch for path "%s"', path)
                self.rm_watch(wd)
            else:
                self.log.debug('Adding exclude filter for "%s"', path)
                # we have a watch that contains the path as a child path
                if not path in self._ignored_paths:
                    self._ignored_paths.append(path)
                # FIXME: This assumes that we do not have another filter
                # function, which in our use case is correct, but what if
                # we ever move this to other projects? Maybe using the
                # manager exclude_filter is better
                if not self._wdm[wd].exclude_filter:
                    self._wdm[wd].exclude_filter = ignore_path

    @property
    def watches(self):
        """Return a reference to the dictionary that contains the watches."""
        return self._wdm

    @property
    def events_queue(self):
        """Return the queue with the events that the manager contains."""
        return self._events_queue


class Notifier(object):
    """Read notifications, process events.

    Inspired by the pyinotify.Notifier.
    """

    def __init__(self, watch_manager, default_proc_fun=None, read_freq=0,
                 threshold=10, timeout=-1):
        """Init to process events according to the given timeout & threshold."""
        super(Notifier, self).__init__()
        self.log = logging.getLogger('ubuntuone.platform.windows.'
                                     'filesystem_notifications.Notifier')
        # Watch Manager instance
        self._watch_manager = watch_manager
        # Default processing method
        self._default_proc_fun = default_proc_fun
        if default_proc_fun is None:
            self._default_proc_fun = PrintAllEvents()
        # Loop parameters
        self._read_freq = read_freq
        self._threshold = threshold
        self._timeout = timeout

    def proc_fun(self):
        return self._default_proc_fun

    def process_events(self):
        """Process the events given the threshold and the timeout."""
        self.log.debug('Processing events with threshold: %s and timeout: %s',
                       self._threshold, self._timeout)
        # we will process an amount of events equal to the threshold of
        # the notifier and will block for the amount given by the timeout
        processed_events = 0
        while processed_events < self._threshold:
            try:
                raw_event = None
                if not self._timeout or self._timeout < 0:
                    raw_event = self._watch_manager.events_queue.get(
                        block=False)
                else:
                    raw_event = self._watch_manager.events_queue.get(
                        timeout=self._timeout)
                watch = self._watch_manager.get_watch(raw_event.wd)
                if watch is None:
                    # Not really sure how we ended up here, nor how we
                    # should handle these types of events and if it is
                    # appropriate to completely skip them (like we are
                    # doing here).
                    self.log.warning('Unable to retrieve Watch object '
                                     'associated to %s', raw_event)
                    processed_events += 1
                    continue
                if watch and watch.proc_fun:
                    self.log.debug('Executing proc_fun from watch.')
                    watch.proc_fun(raw_event)  # user processing
                else:
                    self.log.debug('Executing default_proc_fun')
                    self._default_proc_fun(raw_event)
                processed_events += 1
            except Empty:
                # increase the number of processed events, and continue
                processed_events += 1
                continue

    def stop(self):
        """Stop processing events and the watch manager."""
        self._watch_manager.stop()

While one of the threads retrieves the events from the file system, the second one processes them so that they will be exposed as pyinotify events. I have done so because I did not want to deal with OVERLAPPED structures for async operations in Win32, and because I wanted to use pyinotify events so that if someone with experience in pyinotify looks at the output, he can easily understand it. I really like this approach because it allowed me to reuse a fair amount of logic that we had in the Ubuntu client, and to approach the port in a very TDD way, since the tests I've used are the same ones as those found on Ubuntu.
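The same two-thread pattern can be sketched in a platform-independent way. Everything here is illustrative rather than the real implementation: the fake event source stands in for the ReadDirectoryChangesW loop, and the action map is a trimmed-down stand-in for WINDOWS_ACTIONS plus NAME_TRANSLATIONS:

```python
import threading
from queue import Queue

# stand-in for WINDOWS_ACTIONS + NAME_TRANSLATIONS
ACTIONS = {1: 'FS_FILE_CREATE', 2: 'FS_FILE_DELETE', 3: 'FS_FILE_MODIFY'}


def watcher(raw_source, raw_queue):
    """Producer: push raw (path, action) pairs as the OS reports them."""
    for raw_event in raw_source:
        raw_queue.put(raw_event)
    raw_queue.put(None)  # sentinel: no more events


def processor(raw_queue, events_queue):
    """Consumer: translate raw pairs into named, pyinotify-style events."""
    while True:
        raw_event = raw_queue.get()
        if raw_event is None:
            break
        path, action = raw_event
        events_queue.put((ACTIONS[action], path))


raw_queue, events_queue = Queue(), Queue()
# fake event source instead of the blocking ReadDirectoryChangesW call
fake_source = [('a.txt', 1), ('a.txt', 3), ('a.txt', 2)]
threads = [threading.Thread(target=watcher, args=(fake_source, raw_queue)),
           threading.Thread(target=processor, args=(raw_queue, events_queue))]
for t in threads:
    t.start()
for t in threads:
    t.join()
events = [events_queue.get() for _ in range(3)]
print(events)
```

The queue decouples the two threads, which is exactly why the blocking Win32 call never stalls the event translation.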

Yet again Windows has presented me a challenge when trying to work with its file system, this time in the form of lock files. The Ubuntu One client on Linux uses pyinotify to listen to file system events; this, for example, allows the daemon to update your files when a new version has been created, without the direct intervention of the user.

Although Windows does not have pyinotify (for obvious reasons), a developer who wants to perform such directory monitoring can rely on the ReadDirectoryChangesW function. This function provides a similar behavior, but unfortunately the information it gives is limited when compared with pyinotify. There are fewer events you can listen for on Windows (IN_OPEN and IN_CLOSE, for example, are not present), and it provides very little detail, giving back just 5 actions. That is, while on Windows you can listen to:

FILE_NOTIFY_CHANGE_FILE_NAME

FILE_NOTIFY_CHANGE_DIR_NAME

FILE_NOTIFY_CHANGE_ATTRIBUTES

FILE_NOTIFY_CHANGE_SIZE

FILE_NOTIFY_CHANGE_LAST_WRITE

FILE_NOTIFY_CHANGE_LAST_ACCESS

FILE_NOTIFY_CHANGE_CREATION

FILE_NOTIFY_CHANGE_SECURITY

you will only get back 5 values, which are integers that represent the action that was performed. Yesterday I decided to see if it was possible to query the Windows Object Manager for the currently used file handles, which would return the open files. My idea was to write such a function and then poll (ouch!) to find out when a file was opened or closed. The result of such an attempt is the following:

The above code uses undocumented functions from ntdll which I suppose Microsoft does not want me to use. And while it works, the solution does not scale, since querying the Object Manager is very expensive and can rocket your CPU usage if performed repeatedly. Nevertheless the above code works correctly and could be used to write tools similar to those from Sysinternals.

I hope someone will find a use for the code; in my case it is code that I'll have to throw away.

In the last post I explained how to set the security attributes of a file on Windows. What naturally follows such a post is explaining how to implement the os.access method so that it takes those settings into account, because the default implementation in Python ignores them. Let's first define when a user has read access in our use case:

A user has read access if the user's SID has read access, or the SID of the 'Everyone' group has read access.

The above also includes any type of configuration like rw or rx. In order to do this we have to understand how Windows NT sets the security of a file. On Windows NT the security of a file is set using a bitmask of type DWORD, which can be compared to a 32-bit unsigned long in ANSI C, and this is as far as the normal things go; let's continue with the bizarre Windows implementation. For some reason I cannot understand, the Windows developers, rather than going with the more intuitive solution of using one bit per right, decided to use a combination of bits per right. For example, to set the read flag 5 bits have to be set, for the write flag 6 bits are used, and for execute 4 bits. To make matters less simple, the bitmasks overlap: if we remove the read flag we will also be removing a bit of the execute mask, and there is no documentation to be found about the different masks that are used…
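The bit counts and the overlap are easy to see with the generic file-access masks. The constant values are hardcoded here from WinNT.h so the snippet runs anywhere, not just where pywin32 is installed:

```python
# generic file access masks, values as defined in WinNT.h
FILE_GENERIC_READ = 0x120089
FILE_GENERIC_WRITE = 0x120116
FILE_GENERIC_EXECUTE = 0x1200A0

# the number of bits that make up each right
print(bin(FILE_GENERIC_READ).count('1'))     # 5 bits for read
print(bin(FILE_GENERIC_WRITE).count('1'))    # 6 bits for write
print(bin(FILE_GENERIC_EXECUTE).count('1'))  # 4 bits for execute

# the masks share bits, so clearing one right can touch another
overlap = FILE_GENERIC_READ & FILE_GENERIC_EXECUTE
print(hex(overlap))  # 0x120080: read and execute overlap
```

The shared bits are SYNCHRONIZE, the standard rights, and FILE_READ_ATTRIBUTES, which is exactly why removing read can silently break execute.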

Thankfully for us, the cfengine project has had to go through this process already, and by trial and error discovered the exact bits that provide the read rights. Such a magic number is:

0xFFFFFFF6

Therefore we can simply AND this flag with an existing right to remove the read flag. The number also means that the only important bits we are interested in are bits 0 and 3, which, when both set, mean that the read flag was added. To make matters more complicated, the 'Full Access' right does not use these flags. In order to know if a user has Full Access rights we have to look at bit 28, which, if set, represents the 'Full Access' flag.
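For example, masking FILE_GENERIC_READ with the cfengine magic number clears bits 0 and 3 counting from the least-significant bit, i.e. the FILE_READ_DATA and FILE_READ_EA bits (constants hardcoded from WinNT.h so the snippet is self-contained):

```python
FILE_GENERIC_READ = 0x120089    # from WinNT.h
READ_REMOVAL_MASK = 0xFFFFFFF6  # cfengine's magic number: all bits except 0 and 3

without_read = FILE_GENERIC_READ & READ_REMOVAL_MASK
print(hex(without_read))  # 0x120080: bits 0 and 3 are gone
print(without_read & 0x1, without_read & 0x8)  # 0 0: both read bits cleared
```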

So to summarize: to know if a user has the read flag, we first look at bit 28 to test for the 'Full Access' flag; if Full Access was not granted, we look at bits 0 and 3, and when both of them are set the user has the read flag. Easy, right? Now to the practical example: the below code does exactly what I just explained, using Python and the win32api and win32security modules.

from win32api import GetUserName
from win32security import (
    LookupAccountName,
    LookupAccountSid,
    GetFileSecurity,
    SetFileSecurity,
    ACL,
    DACL_SECURITY_INFORMATION,
    ACL_REVISION)
from ntsecuritycon import (
    FILE_ALL_ACCESS,
    FILE_GENERIC_EXECUTE,
    FILE_GENERIC_READ,
    FILE_GENERIC_WRITE,
    FILE_LIST_DIRECTORY)

platform = 'win32'

EVERYONE_GROUP = 'Everyone'
ADMINISTRATORS_GROUP = 'Administrators'


def _int_to_bin(n):
    """Convert an int to a bin string of 32 bits."""
    return "".join([str((n >> y) & 1) for y in range(32 - 1, -1, -1)])


def _has_read_mask(number):
    """Return if the read flag is present."""
    # get the bin representation of the mask
    binary = _int_to_bin(number)
    # there is no actual documentation of this in MSDN, but if bit 28 is
    # set, the mask has full access, more info can be found here:
    # http://www.iu.hio.no/cfengine/docs/cfengine-NT/node47.html
    if binary[28] == '1':
        return True
    # there is no documentation in MSDN about this either, but if bits 0
    # and 3 are set we have the read flag, more info can be found here:
    # http://www.iu.hio.no/cfengine/docs/cfengine-NT/node47.html
    return binary[0] == '1' and binary[3] == '1'


def access(path):
    """Return if the path is at least readable."""
    # for a file to be readable it has to be readable either by the user
    # or by the everyone group
    security_descriptor = GetFileSecurity(path, DACL_SECURITY_INFORMATION)
    dacl = security_descriptor.GetSecurityDescriptorDacl()
    sids = []
    for index in range(0, dacl.GetAceCount()):
        # remember the sid of the ace if its access mask grants read
        ace = dacl.GetAce(index)
        if _has_read_mask(ace[1]):
            sids.append(ace[2])
    accounts = [LookupAccountSid('', x)[0] for x in sids]
    return GetUserName() in accounts or EVERYONE_GROUP in accounts

When I wrote this my brain was in a WTF state, so I'm sure that the horrible _int_to_bin function can be exchanged for the bin built-in function from Python. If you fancy doing it I would greatly appreciate it; I cannot take this any longer.

While working on making the Ubuntu One code more multiplatform I found myself having to write some code to set the attributes of a file on Windows. Ideally os.chmod would do the trick, but of course this is Windows, and it is not fully supported. According to the Python documentation:

Note: Although Windows supports chmod(), you can only set the file’s read-only flag with it (via the stat.S_IWRITE and stat.S_IREAD constants or a corresponding integer value). All other bits are ignored.

Grrrreat… To solve this issue I have written a small function that allows setting the attributes of a file by using the win32api and win32security modules. This solves the issue only partially, since 0444 and the other modes cannot be perfectly mapped to the Windows world. In my code I have made the assumption that using the groups 'Everyone' and 'Administrators' plus the user name would be close enough for our use cases.

Here is the code in case anyone has to go through this:

from win32api import MoveFileEx, GetUserName
from win32file import (
    MOVEFILE_COPY_ALLOWED,
    MOVEFILE_REPLACE_EXISTING,
    MOVEFILE_WRITE_THROUGH)
from win32security import (
    LookupAccountName,
    GetFileSecurity,
    SetFileSecurity,
    ACL,
    DACL_SECURITY_INFORMATION,
    ACL_REVISION)
from ntsecuritycon import (
    FILE_ALL_ACCESS,
    FILE_GENERIC_EXECUTE,
    FILE_GENERIC_READ,
    FILE_GENERIC_WRITE,
    FILE_LIST_DIRECTORY)

EVERYONE_GROUP = 'Everyone'
ADMINISTRATORS_GROUP = 'Administrators'


def _get_group_sid(group_name):
    """Return the SID for a group with the given name."""
    return LookupAccountName('', group_name)[0]


def _set_file_attributes(path, groups):
    """Set file attributes using the win32api."""
    security_descriptor = GetFileSecurity(path, DACL_SECURITY_INFORMATION)
    dacl = ACL()
    for group_name in groups:
        # set the attributes of the group only if not null
        if groups[group_name]:
            group_sid = _get_group_sid(group_name)
            dacl.AddAccessAllowedAce(ACL_REVISION, groups[group_name],
                                     group_sid)
    # the dacl has all the info of the different groups passed in the
    # parameters
    security_descriptor.SetSecurityDescriptorDacl(1, dacl, 0)
    SetFileSecurity(path, DACL_SECURITY_INFORMATION, security_descriptor)


def set_file_readonly(path):
    """Change path permissions to readonly in a file."""
    # we use the win32 api because chmod just sets the readonly flag and
    # we want to have more control over the permissions
    groups = {}
    groups[EVERYONE_GROUP] = FILE_GENERIC_READ
    groups[ADMINISTRATORS_GROUP] = FILE_GENERIC_READ
    groups[GetUserName()] = FILE_GENERIC_READ
    # the above is more or less equal to 0444
    _set_file_attributes(path, groups)

For those who might want to remove the read access from a group: just leave that group out of the groups parameter, which will remove it from the security descriptor.

At the moment I am sprinting in Argentina trying to make the Ubuntu One port to Windows better by adding support for the sync daemon used on Linux. While the rest of the guys are focused on accommodating the current code to my “multiplatform” requirements, I'm working on getting a number of missing parts to work on Windows. One of these parts is the lack of a network manager on Windows.

One of the things we need, in order to continuously sync your files on Windows, is an event telling us when your network connection comes up or dies. As usual, this is far easier on Linux than on Windows. To get this event you have to implement the ISensNetwork COM interface, which will allow your object to register for network status changes. Due to the absolute lack of examples on the net (or how bad Google is getting), I've decided to share the code I managed to get working:

The above code represents a NetworkManager class that will execute a callback according to the event raised by the SENS subsystem. It is important to note that in the above code the ‘Connected’ event will be fired 3 times, since we registered for three different connect events, while a single ‘Disconnected’ event will be fired. The way to fix this would be to register for just one event according to the Windows version you are running on, but since we do not care in the Ubuntu One sync daemon, well, I left it there so everyone can see it.

Sometimes on Linux we take DBus for granted. On the Ubuntu One Windows port we have had to deal with the fact that DBus on Windows is not that great, and therefore had to write our own IPC between the Python code and the C# code. To solve the IPC we have done the following:

Listen to a named pipe from C#

The approach we have followed here is pretty simple: we create a thread pool whose threads each create a named pipe server. The reason for using a thread pool is to avoid the situation in which we have a single thread dealing with the messages from Python and a very chatty Python developer. The code in C# is very straightforward:

/*
* Copyright 2010 Canonical Ltd.
*
* This file is part of UbuntuOne on Windows.
*
* UbuntuOne on Windows is free software: you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License version
* as published by the Free Software Foundation.
*
* Ubuntu One on Windows is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with UbuntuOne for Windows. If not, see <http://www.gnu.org/licenses/>.
*
* Authors: Manuel de la Peña <manuel.delapena@canonical.com>
 */
using System;
using System.IO;
using System.IO.Pipes;
using System.Threading;
using log4net;

namespace Canonical.UbuntuOne.ProcessDispatcher
{
    /// <summary>
    /// This object represents a listener that will be waiting for messages
    /// from the python code and will perform an operation for each message
    /// that has been received.
    /// </summary>
    internal class PipeListener : IPipeListener
    {
        #region Helper struct

        /// <summary>
        /// Private structure used to pass the state of the listener to the
        /// different listening threads.
        /// </summary>
        private struct PipeListenerState
        {
            #region Variables

            private readonly string _namedPipe;
            private readonly Action<object> _callback;

            #endregion

            #region Properties

            /// <summary>
            /// Gets the named pipe to which the thread should listen.
            /// </summary>
            public string NamedPipe { get { return _namedPipe; } }

            /// <summary>
            /// Gets the callback that the listening pipe should execute.
            /// </summary>
            public Action<object> Callback { get { return _callback; } }

            #endregion

            public PipeListenerState(string namedPipe, Action<object> callback)
            {
                _namedPipe = namedPipe;
                _callback = callback;
            }
        }

        #endregion

        #region Variables

        private readonly object _loggerLock = new object();
        private ILog _logger;
        private bool _isListening;
        private readonly object _isListeningLock = new object();

        #endregion

        #region Properties

        /// <summary>
        /// Gets the logger to be used with the object.
        /// </summary>
        internal ILog Logger
        {
            get
            {
                if (_logger == null)
                {
                    lock (_loggerLock)
                    {
                        _logger = LogManager.GetLogger(typeof(PipeListener));
                    }
                }
                return _logger;
            }
            set
            {
                _logger = value;
            }
        }

        /// <summary>
        /// Gets if the pipe listener is indeed listening to the pipe.
        /// </summary>
        public bool IsListening
        {
            get { return _isListening; }
            private set
            {
                // we have to lock to ensure that the threads do not screw
                // each other up, this makes a small step of the processing
                // synchronous :(
                lock (_isListeningLock)
                {
                    _isListening = value;
                }
            }
        }

        /// <summary>
        /// Gets and sets the number of threads that will be used to listen
        /// to the pipe. Each thread will listen for connections and will
        /// dispatch the messages whenever they are done.
        /// </summary>
        public int NumberOfThreads { get; set; }

        /// <summary>
        /// Gets and sets the pipe stream factory that knows how to generate
        /// the streamers used for the communication.
        /// </summary>
        public IPipeStreamerFactory PipeStreamerFactory { get; set; }

        /// <summary>
        /// Gets and sets the action that will be performed with the message
        /// that is received by the pipe listener.
        /// </summary>
        public IMessageProcessor MessageProcessor { get; set; }

        #endregion

        #region Helpers

        /// <summary>
        /// Helper method that is used in another thread that will be
        /// listening to the possible events from the pipe.
        /// </summary>
        private void Listen(object state)
        {
            var namedPipeState = (PipeListenerState)state;
            try
            {
                var threadNumber = Thread.CurrentThread.ManagedThreadId;
                // starts the named pipe since in theory it should not be
                // present, if there is a pipe already present we have an
                // issue.
                using (var pipeServer = new NamedPipeServerStream(
                    namedPipeState.NamedPipe, PipeDirection.InOut,
                    NumberOfThreads, PipeTransmissionMode.Message,
                    PipeOptions.Asynchronous))
                {
                    Logger.DebugFormat("Thread {0} listening to pipe {1}",
                        threadNumber, namedPipeState.NamedPipe);
                    // we wait until the python code connects to the pipe, we
                    // do not block the rest of the app because we are in
                    // another thread.
                    pipeServer.WaitForConnection();
                    Logger.DebugFormat("Got client connection in thread {0}",
                        threadNumber);
                    try
                    {
                        // create a streamer that knows the protocol
                        var streamer = PipeStreamerFactory.Create();
                        // Read the request from the client.
                        var message = streamer.Read(pipeServer);
                        Logger.DebugFormat(
                            "Message received in thread {0} is {1}",
                            threadNumber, message);
                        // execute the action that has to occur with the
                        // message
                        namedPipeState.Callback(message);
                    }
                    // Catch the IOException that is raised if the pipe is
                    // broken or disconnected.
                    catch (IOException e)
                    {
                        Logger.DebugFormat(
                            "Error in thread {0} when reading pipe {1}",
                            threadNumber, e.Message);
                    }
                }
                // if we are still listening, we will queue a new work item
                // to keep listening, otherwise no more threads will be
                // added. Of course if the rest of the threads do not add
                // more than one work item each, we will have no issues with
                // the pipe server since it has been disposed.
                if (IsListening)
                {
                    ThreadPool.QueueUserWorkItem(Listen, namedPipeState);
                }
            }
            catch (PlatformNotSupportedException)
            {
                // are we running on an OS that does not have pipes (Mono on
                // some OSes)?
                Logger.InfoFormat("Cannot listen to pipe {0}",
                    namedPipeState.NamedPipe);
            }
            catch (IOException)
            {
                // there are too many servers listening to this pipe.
                Logger.InfoFormat(
                    "There are too many servers listening to {0}",
                    namedPipeState.NamedPipe);
            }
        }

        #endregion

        /// <summary>
        /// Starts listening to the different pipe messages and will perform
        /// the appropriate action when a message is received.
        /// </summary>
        /// <param name="namedPipe">The name of the pipe to listen to.</param>
        public void StartListening(string namedPipe)
        {
            if (NumberOfThreads < 1)
            {
                throw new PipeListenerException(
                    "The number of threads to use to listen to the pipe "
                    + "must be at least one.");
            }
            IsListening = true;
            // we will be using a thread pool that will allow the different
            // threads to listen to the messages of the pipes. There could
            // be issues if the devel provided far too many threads to
            // listen to the pipe, since the number of pipe servers is
            // limited.
            for (var currentThreadCount = 0;
                 currentThreadCount < NumberOfThreads;
                 currentThreadCount++)
            {
                // we add a new thread to listen
                ThreadPool.QueueUserWorkItem(Listen,
                    new PipeListenerState(namedPipe,
                        MessageProcessor.ProcessMessage));
            }
        }

        /// <summary>
        /// Stops listening to the different pipe messages. All the threads
        /// that are already listening will be forced to stop.
        /// </summary>
        public void StopListening()
        {
            IsListening = false;
        }
    }
}

Sending messages from Python

Once the pipe server is listening on the .NET side, we simply have to use the CallNamedPipe function to send messages to .NET. In my case I have used JSON as a deliberately simple protocol; ideally you should use something smarter like protocol buffers.
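As a sketch of what the Python side can look like: this uses the CallNamedPipe function from the pywin32 bindings (win32pipe). The pipe name and the JSON message shape here are my own illustrative assumptions, not Ubuntu One's actual protocol.

```python
import json

# Hypothetical pipe name -- it must match whatever name the .NET
# PipeListener was started with, e.g. StartListening("ubuntuone").
PIPE_NAME = r'\\.\pipe\ubuntuone'


def build_message(action, **params):
    """Serialize a request as JSON, the toy protocol used in this post."""
    return json.dumps({'action': action, 'params': params}).encode('utf-8')


def send_message(message, timeout_ms=2000):
    """Send one message to the .NET pipe server and return its reply.

    Requires the pywin32 package and, of course, Windows.
    CallNamedPipe opens the pipe, writes the request, reads the
    reply, and closes the pipe in a single call.
    """
    import win32pipe  # imported here so the module loads on non-Windows systems
    return win32pipe.CallNamedPipe(PIPE_NAME, message, 4096, timeout_ms)
```

A client would then do something like `send_message(build_message('sync', folder='Documents'))`; since CallNamedPipe is a single round trip, it fits a one-request/one-reply protocol like this one nicely.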

It is no secret that I love Spring.Net; it makes the development of big applications a pleasure. During the port of Ubuntu One to Windows I have been using the framework to initialise the WCF services that we use to give other .Net applications the ability to communicate with Ubuntu One. Yes, this is our DBus alternative!

The idea behind using WCF is to allow other applications to use the different features that Ubuntu One provides; the very first application that we would like to see use this is Banshee on Windows (I have to start looking into that, but I have too much to do right now). To provide this functionality we use named pipes for the communication, for two reasons:

For an application to host a WCF service that uses any binding besides the named pipe binding requires special permissions. This is clearly a no-no for a user application like Ubuntu One.

Named pipes are damned efficient! Named pipes on Windows are implemented at the kernel level. Cool.

Initially I thought of hosting the WCF services in a Windows service, why not?! Once I had this feature implemented, I realized the following: while impersonation does carry over to different threads, it does not carry over to spawned processes. This is a major pain. The reason this is a problem is that if an application is executed in a different user space, the environment variables it sees are those of the user executing the code. This means that things like your user's roaming application data directory cannot be used, plus there are other security issues.

After realizing that the WCF services could not be hosted in a Windows service, I moved on to write a workaround that would do the following:

Configure the WCF services to use named pipes only for the current user.

Start a console application that will host the WCF services.

Start the different WCF clients for Ubuntu One (currently this is our client app, but it could be your own!).

Although the definition of the solution is simple, we had to work around the fact that up 'til now all our WCF services were defined through configuration and were injected by the Spring.Net IoC. Usually you can change the location of your app domain configuration by using the following code:

AppDomain.CurrentDomain.SetData("APP_CONFIG_FILE", "c:\\ohad.config");

In theory, with the above code you can redirect the configuration to a new file, and if you then use, for example:

System.Configuration.ConfigurationSettings.AppSettings["my_setting"]

you will be able to get the value from your new configuration. Unfortunately, the Spring.Net IoC uses the ConfigurationManager class, which ignores that setting… Now what?

Well, rewriting all the code to not use the Spring.Net IoC was not an option: it would mean a lot of work, and it would mean moving from an application where dependencies are injected to one where we have to manually initialise all the different objects. After some careful thought, I moved to use a small CLR detail that I knew of to make the AppDomain that executes our code use the user's configuration. The trick is the following: use one AppDomain to start the application. This is a dummy AppDomain that does not execute any code at all, but launches a second AppDomain whose configuration is the correct one and which executes the actual code.

In case I did not make any sense, here is some example code:

using System;
using Canonical.UbuntuOne.Common.Container;
using Canonical.UbuntuOne.Common.Utils;
using log4net;

namespace Canonical.UbuntuOne.ProcessDispatcher
{
    static class Program
    {
        private static readonly ILog _logger = LogManager.GetLogger(typeof(Program));
        private static readonly ConfigurationLocator _configLocator = new ConfigurationLocator();

        /// <summary>
        /// This method starts the service.
        /// </summary>
        static void Main()
        {
            _logger.Debug("Redirecting configuration");
            // Setup information for the new appdomain.
            var setup = new AppDomainSetup
            {
                ConfigurationFile = _configLocator.GetCurrentUserDaemonConfiguration()
            };
            // Create the new appdomain with the new config.
            var executionAppDomain = AppDomain.CreateDomain("ServicesAppDomain",
                AppDomain.CurrentDomain.Evidence, setup);
            // Run the services inside that appdomain.
            executionAppDomain.DoCallBack(() =>
            {
                _logger.Debug("Starting services.");
                // use the IoC to get the implementation of the SyncDaemon service;
                // the IoC will take care of setting the object up correctly.
                ObjectsContainer.Initialize(new SpringContainer());
                var syncDaemonWindowsService =
                    ObjectsContainer.GetImplementationOf<SyncDaemonWindowsService>();
                // To run more than one service you have to add them here
                syncDaemonWindowsService.Start();
                while (true) ;
            });
        }
    }
}