Surendra Verma, Development Manager on the Vista Kernel team, digs into Vista's new Transactional File System with Charles Torre. TxF, as it is referred to internally, is a new kernel construct that is part of an updated Vista NTFS. Surendra provides
a high-level overview of TxF in this video. We will continue to look under the hood in future Going Deep episodes with interviews with developers working on the Kernel Transaction Manager, a key component of TxF, which Surendra touches on during the whiteboard
session, which is a Going Deep requirement.

I'd probably have to see the video a couple of times to wrap my head around it (this is the first really informative video / article on this topic. I had seen some news articles/blogs mention it, but not in great detail).

Just had a burning question - the managed APIs are part of the .net framework (2.0 RTM) correct? There was talk about them being in WinFX which didn't make sense to me (WinFX = Avalon, Indigo, Workflow). Also, are they available only on Vista? (if so, are they
in the framework or the Vista managed SDK)

Any blogs or whitepapers on this subject? Would love to see people talk about real-world usage and places where it makes more sense to use this than to use traditional system calls (given the cost of using transactions).

Would love to see people talk about real-world usage and places where it makes more sense to use this than to use traditional system calls (given the cost of using transactions).

What kind of program would not benefit from this? There are certain processes where one should not use a transaction over the whole process (if power fails, you would not want to undo what you managed to record of, for example, an important event), but
even inside that process you could use it in a more fine-grained way, to keep your live video file recording in a consistent state so it would work without any "rebuilding of index" or such if power fails in the middle of the recording.

The performance is probably fine if you have one of those hybrid hard drives.

In reply to "Just had a burning question - the managed APIs are part of the .net framework (2.0 RTM) correct? There was talk about them being in WinFX which didn't make sense to me (WinFX = Avalon, Indigo, Workflow). Also, are they available only on Vista?
(if so, are they in the framework or the Vista managed SDK)"

That's not WinFX (it's not WCF+WPF+WWF at all). That's part of WinFX. WinFX is the new Microsoft API and as such it's the go to card for general future development. The .NET FX is considered part of the WinFX model as well. This is why, when I teach .NET,
I never say ".NET" I always refer to managed code as "WinFX" because I'm teaching the API in general even though nothing in my WinFX course has ANYTHING to do with WCF, WPF, or WWF. It's still WinFX.

That said, WinFX is different things to different people. For instance, to a marketing person WinFX might be WCF, WPF, WWF, but to the architects and developers it's the new world of development. Not a technical idea, or anything like that, but a new model,
a new mindset, and a fully new API for development and architecture.

A journaling file system will ensure that individual metadata changes are atomic (such as creating directory entries, renaming files, etc.). It only applies to one change. If your system crashes, the file structure itself will be correct, but the data
in your files might not be.

In a logging file system (terminology is sometimes different), both the metadata and the actual data are journaled. This allows it to make sure the contents of the file are correct, but it still only works on one operation.

A transactional file system (AFAICT for Vista) will do both data and metadata over many modifications. So, you can create and modify many files atomically (either everything is done or nothing is done).

It's nice to see Microsoft moving us out of the stone age. Trying to write safe file-system modification routines on most file systems is tricky (using atomic rename tricks and such) and requires a lot of "check for cleanup" code during startup if you want
to handle it properly. I work on systems that cannot fail (embedded systems where nobody can administrate them), and would really love something like this.
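
For readers who have not run into the "atomic rename trick" mentioned above, here is a minimal Python sketch of it, assuming a filesystem where renaming over an existing file is atomic. The helper names `atomic_write` and `cleanup_stale_temps` and the `.tmp` suffix are made up for illustration. Note that this only protects one file at a time, which is the limitation a multi-file transaction would remove.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so readers see the old or the new content, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory (same volume),
    # so the final rename does not cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before the rename
        os.replace(tmp_path, path)  # the rename acts as the "commit" point
    except BaseException:
        # Roll back: drop the partial temporary file if it still exists.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def cleanup_stale_temps(directory):
    """The "check for cleanup" step at startup: remove temp files left behind by a crash."""
    removed = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".tmp"):
            os.remove(os.path.join(directory, name))
            removed.append(name)
    return removed
```

Updating two files this way still leaves a window where one has been renamed and the other has not, so the startup code has to detect and repair that state itself.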

In the interview, you alluded to the tight integration between NTFS and TxF. I am wondering how far this goes? In particular, as hard disks become more intelligent (e.g. native command queuing and hybrid flash/conventional disks), does TxF include special code
to benefit from these features, or does it benefit indirectly, through NTFS?

You talk about the performance cost of transactions...for applications that do not need transactions (not all applications do), can an application not use the transaction features and recoup the performance hit?

You talk about the performance cost of transactions...for applications that do not need transactions (not all applications do), can an application not use the transaction features and recoup the performance hit?

I'm pretty sure Verma said you can still use traditional file operations when you don't need transactional guarantees; both are supported. However, any changes that are committed by another transactional action will show up for any app using non-transactional
operations on the same file. So, use transactional operations if you need a guaranteed state either when reading or writing; otherwise, traditional file operations are your best bet for less overhead. Or you can just do what the pros do, and wave a magnet
over the platters in the correct sequence.

There is most definitely a performance hit involved with this (I'm not lucky enough to have one of the forthcoming hybrid disks). I use a Lenovo T60p to do software development for the JBoss application server. The specs (that matter) are:

Core Duo T2500 @ 2.0 GHz
2 GB RAM (DDR2-667)
100 GB 7200 RPM SATA HDD

Under plain Windows XP, the time to bring the server up and deploy the webapps is about 45 seconds on average. Using Windows Vista (the drivers installed between the two OS's are as comparable as I could get them to be), the exact same process takes 65 seconds
on average. That's almost a 50% increase, and the only real variable between the two is that Win XP is a non-transactional filesystem, while Vista is transactional.

Starting the server and deploying webapps is a highly filesystem-bound operation, as my desktop with the following specs:

>>>>Under plain Windows XP, the time to bring the server up and deploy the webapps is about 45 seconds on average. Using Windows Vista (the drivers installed between the two OS's are as comparable as I could get them to be), the exact same process takes
65 seconds on average. That's almost a 50% increase, and the only real variable between the two is that Win XP is a non-transactional filesystem, while Vista is transactional.

You should be careful when comparing the performance of a released product build with one that is in beta. Currently the Vista builds have all sorts of debugging hooks, monitors, and symbols, which of course would not be the case in the final build.

We almost met in Sao Paulo on Friday. I recognized your face and planned to ask you a few questions about TxF, but then we didn't meet

1. If you compare the isolation level that TxF provides with the typical isolation level of databases, which would be the best match? Snapshot by any chance?

2. I read that the TxF API changed a lot in RC1. From what I have seen (superficially), it looks a lot less transparent than before. I may be wrong, but it seems that before you could just create an ambient transaction and begin working with files as usual,
and now you have to explicitly invoke transactional versions of the file system APIs to get transactional behavior. Am I right? Was this design change implemented on purpose, or was it because of a roadblock that forced you to sacrifice the original simplicity?

3. I would like to better understand the role of KTM and DTC. Does it work similarly to promotable transactions in ADO.NET? Is it correct to think of KTM as some kind of lightweight transaction manager that will only delegate to DTC when it needs to coordinate
with resource managers of other kinds?

4. Do you think DTC is too expensive nowadays performance-wise for what it does?

Thank you. And it was nice to see a Channel 9 celebrity with my own eyes

Sorry I missed you in Sao Paulo. I really enjoyed Brazil and Sao Paulo in particular.

You asked some really good questions. Let me try to address them:

1) TxF supports committed read isolation. That means you always get the most recently committed data. When read-only handles are opened in a transaction their "view" is decided at the time of the open based on the "most recently committed rule". All reads on
that handle are set to return that view for as long as the handle is kept open. Other transactions are free to update the file as many times as they like.

When read-only handles are opened outside of any transaction the view is decided the same way, but it's not frozen for the duration of the handle - it's changed if some transaction commits a change. This is designed to be compatible with non-transaction-aware
applications that (when they share with writers) expect to see changes.

Directory queries ("ls", "dir" etc) always show the most recently committed directory contents as of the time the API is called. Again, the directory can be changed concurrently by other transactions by adding and deleting files.

The main thing that comes out of this is that TxF supports concurrent reads while transactions change contents. So, you could update your web-site in a transaction and the reads will get the old view until commit time. At the time of the commit you'd need to
prevent reads so as not to race with it.

Change notifications and the NTFS USN journal entries are also isolated so that they only show up after commit.
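
To make the two kinds of read-only handles described above a little more concrete, here is a toy in-memory model in Python. It is only a sketch of the visibility rules as stated in this answer, not of how TxF is implemented; the class names `ToyFile`, `TransactedReadHandle`, and `PlainReadHandle` are invented for the illustration.

```python
class ToyFile:
    """In-memory stand-in for one file; only committed content is modeled."""

    def __init__(self, content):
        self.committed = content  # most recently committed content

    def commit(self, new_content):
        # A writer transaction commits; this becomes the latest committed version.
        self.committed = new_content


class TransactedReadHandle:
    """Read-only handle opened inside a transaction: the view is fixed at open time."""

    def __init__(self, file):
        # View chosen by the "most recently committed" rule when the handle is opened.
        self._view = file.committed

    def read(self):
        return self._view


class PlainReadHandle:
    """Read-only handle opened outside any transaction: it follows later commits."""

    def __init__(self, file):
        self._file = file

    def read(self):
        return self._file.committed


page = ToyFile(b"old site")
in_txn = TransactedReadHandle(page)
outside = PlainReadHandle(page)

page.commit(b"new site")  # another transaction updates the file and commits

print(in_txn.read())   # the view chosen when the handle was opened
print(outside.read())  # the newly committed content
```

A transacted handle opened after the commit would pick up the new content, since its view is decided at open time.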

2)

You're right about the change in our APIs in RC1. Previously, you could have an ambient transaction and all your NTFS and registry operations would be a part of the transaction. That was a very easy model to get used to and start programming with! The problem
with it was that it was also easy to accidentally include operations you didn't really intend to include in the transaction. This was particularly true when programming at the higher ends of the programming stack (COM+ and .Net), since many of those components
use the Windows registry themselves, and it was very difficult for a programmer to know when/if lower level registry operations were getting included.

So, we decided to make the model explicit. Now we have versions of about a dozen file APIs and some registry APIs that explicitly take the transaction parameter. This way you know exactly what you're transacting, and are much less likely to get hard-to-debug
bugs.

We feel that this model is a lot more transparent in its contract and hopefully much less confusing.

We'd love to hear feedback on this. One thing we have heard is that people would like to have the transaction present in the thread environment (but not applied to operations). That way app writers have much less of a burden keeping track of it.

3)

Yes it's similar to the promotable transactions in ADO.NET, except that the lightweight TM is the LTM which directly delegates to KTM when the transaction spans kernel resources only. This way if you start a transaction in .Net and work on a file in NTFS in
that transaction you don't pay for the DTC overhead (process switch). DTC automatically gets enlisted in the transaction if non-kernel resources like SQL get included in the transaction. This happens seamlessly from the application's point of view.

KTM is a fairly lightweight transaction manager and is designed to serve the needs of the kernel resources directly, as well as interface with DTC for broad transaction support.

4)

I'm not quite sure how to characterize DTC's performance. I guess it depends on what you're using it for and what the point of comparison is.

You're welcome! These are good questions. My email address is sverma@microsoft.com. Feel free to send mail if you'd like to converse.

Vista is a very different operating system than XP in many ways. A lot more happens on the OS; for example there're more services in the system, many of them index data. There's also more code in the system for keeping old versions of data around (snapshots)
that is active - it allows you to revert system changes.

Based on our performance measurements the cost of transactions in the filesystem is quite small (under 2% in the worst case) when transactions are not being used.

Sorry I missed you in Sao Paulo. I really enjoyed Brazil and Sao Paulo in particular.

You asked some really good questions. Let me try to address them:

...

Hi again Surendra,

I did not see your answer until Saturday. Thanks for being so kind as to give me your address. Unfortunately, I am not sure if you are receiving my messages because I am not getting any delivery confirmation. I just hope you are busy celebrating Vista's shipment

Well, my email was actually a follow up to this thread, so here it is:

Sorry for trying to understand TxF and TxR from a database/managed developer point of view, but this is where I am coming from after all

Regarding the isolation level you describe, I guess if I had to explain TxF/TxR isolation level to a database geek, I would just tell him: “it’s just like REPEATABLE READS”.

Regarding the change in the APIs, I understand why you had to switch to an explicit model; however I grieve over what you had to give up.

While the problem with the implicit model is that you cannot opt-out of the ambient transaction, with the explicit model you cannot easily get transactional behavior through the “upper” programming stack without rewriting a good share of it.

I am sure that alternatives were carefully considered. However, if it is not abusing your time, may I ask you about the merits and flaws of an alternate approach that I am thinking of? Surely I am not the only one who has thought about this:

1. The original version of the API supports an ambient transaction and you mentioned that you could still expose that transaction handle at the thread level. I am not sure where things landed in the final Vista build.

2. Apparently, to work with any of the managed or unmanaged APIs that could take advantage of transactional semantics, you have to call a function with a file name (or key name for TxR) as a parameter at some point, and in case you are talking to one
of the higher level layers, those will ultimately invoke one of a small set of Win32 name-based APIs.

So, what if you could define a “transactional” moniker that you would need to prepend to the file name or registry key parameter in order to get “implicit” transactional semantics? I can imagine something like “txfile:”, but it could be something else.

I think this would have a few significant effects:

1. The non-transactional semantics of existing code would be preserved even in the presence of an ambient transaction, because hardcoded and existing strings would be lacking the moniker.

2. You would get back the possibility of using transactions with any existing API that takes a file name (or a key name) from Win32 through the .NET Framework, requiring minimal changes.

3. If you included something like this in a future version of Windows, you could still mix explicit, Vista-like calls, with implicit, moniker based calls enlisted in the same transaction.

I understand that this is not a completely clean approach and maybe there is even something naïve about it. But in any case, it would be nice to learn what you think the tradeoffs are.

Thank you for listening! And congratulations on shipping this cool technology in Vista! It is really something I have wished for many times.

Hi, how would a transactional filesystem handle the case where the user opens one file twice, modifies the data in both, and saves them at the same time? Why does it fail? Maybe it's a good idea to save the sector of the FAT into RAM if the user opens a file?

I have an application which archives a file's data out to the cloud, which 1. renders the file offline and 2. leaves a 4K stub on the local file system. In an XP file system, modifying properties does not invoke a file recall (a recall is triggered by reading any part of the data). With the transactional changes, the file is recalled when viewing system properties. 2 questions: 1. Does viewing the system properties of a file read data? 2. Is there a way to disable the transactional model in Vista or Win7? Thanks
