Based on our PIO cancer registry, I helped to conduct the above-titled article (PubMed-Link).

Abstract

Background: Adjuvant treatment concepts have improved the 10-year cure rate of breast and colon cancer, but new treatments for metastatic disease have yielded only incremental benefit. If treatments for disseminated cancer were actually prolonging life rather than only increasing remission rates, this effect should have been documented over the last 30+ years. However, published data concerning advances in treatment for disseminated cancer have been contradictory.

Patients and Methods: To add data-based information, we analyzed two sources: a regional population-based cancer registry (Hamburgisches Krebsregister, HKR) and a research cancer registry (Projektgruppe Internistische Onkologie, PIO). We compared the survival of several thousand patients with metastatic disease who received treatment only after dissemination with that of patients who received initial adjuvant therapy.

Results: After adjuvant treatment, survival in patients with disseminated breast cancer is up to one-third shorter than that of patients without adjuvant therapy.

Still searching for the best way to integrate Kaplan-Meier survival estimates into Reporting Services-based reports, I finally settled on developing a custom report item. Inspired by, and based on, the polygons example that ships as a custom report item sample for SQL Server 2008, I brought together all my previous efforts in creating Kaplan-Meier charts and calculating survival estimates.

Objectives

The goal was to implement a custom report item that consumes a data set containing an identifier (e.g. a patient ID), a survival period (float), a survival status (bit), and a group label (string), and creates a feature-rich Kaplan-Meier plot. The following picture shows an example: the overall survival of patients suffering from multiple myeloma, grouped by their initial treatment.

The custom report item should have a design-time component as a control that can be used in the Visual Studio Report Designer environment.

Because I was not able to figure out how to create chart series and access the data within the custom report item's run-time component using the Chart class in Microsoft.ReportingServices.OnDemandReportRendering, I decided to implement the plotting using the Windows Forms chart controls. If there is documentation I have overlooked, please leave a comment.
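For readers unfamiliar with the Windows Forms chart controls, the plotting can be sketched roughly as follows. This is a minimal sketch, not the actual project code; the class and parameter names are illustrative assumptions.

```csharp
// Minimal sketch: drawing one Kaplan-Meier curve with the Windows Forms
// chart controls (System.Windows.Forms.DataVisualization.Charting).
// "KmPlot", "time" and "survival" are illustrative names, not project code.
using System.Windows.Forms.DataVisualization.Charting;

public static class KmPlot
{
    public static Chart CreateChart(double[] time, double[] survival, string groupLabel)
    {
        var chart = new Chart { Width = 600, Height = 400 };
        chart.ChartAreas.Add(new ChartArea("km"));

        // StepLine reproduces the characteristic Kaplan-Meier "staircase".
        var series = new Series(groupLabel)
        {
            ChartType = SeriesChartType.StepLine,
            ChartArea = "km"
        };
        for (int i = 0; i < time.Length; i++)
            series.Points.AddXY(time[i], survival[i]);

        chart.Series.Add(series);
        return chart;
    }
}
```

One such series per group label produces the grouped plot shown in the picture above.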

Implementation

The custom report item has been developed and tested against SQL Server 2008 R2. My Visual Studio solution contains three projects: one for the run-time component, one for the design-time component, and a third containing all the methods for chart rendering and calculations. Within the design-time component these rendering methods are used to give the user a preview image.

To install the custom report item follow these steps:

Designer Extension:

Copy the assemblies to “%PROGRAMFILES%\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies”. In the same folder, modify “RSReportDesigner.config”.
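The config entries follow the pattern documented for the polygons sample: the custom report item is registered once for the report processor and once for the designer. The names and the assembly-qualified type below are assumptions for this project, not the exact entries:

```xml
<!-- Sketch of the RSReportDesigner.config entries, modeled on the polygons
     CRI sample. Type values are hypothetical "Namespace.Class, Assembly"
     names for this project's run-time and design-time components. -->
<Extensions>
  <ReportItems>
    <ReportItem Name="KMChart" Type="KMChart.KMChartControl, KMChart" />
  </ReportItems>
  <ReportItemDesigner>
    <ReportItem Name="KMChart" Type="KMChart.KMChartDesigner, KMChartDesigner" />
  </ReportItemDesigner>
</Extensions>
```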

User experience

After adding the KMChart report item to your report, you will see a preview image rendered from default data built into the chart control. To edit a report using the kmchartdesigner custom control in Visual Studio, you can do any of the following:

Set the properties of the KMChart control in the property browser.

Edit properties through the control’s shortcut menu.

Drag fields onto the drop areas of the control from the fields list.

Summary

Using a custom report item has finally turned out to be the best solution for implementing advanced reporting needs in SSRS. Although I had a hard time figuring out all the nuts and bolts of the report lifecycle (especially the data part), it was worth the effort to end up with a reusable chart item for survival analysis. A download of the Visual Studio solution will be available soon.

.NET Bio, formerly the Microsoft Biology Foundation, is an open-source, language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET Framework. Applications written for this platform can be implemented in a variety of .NET languages, including C#, F#, Visual Basic .NET, and IronPython.

This release fixes a range of bugs reported in version 1.0 and adds new features such as AB1 and SFF file format parsing as well as better output and help for the command-line tools.

Introduction

Inspired by the use of user-defined table types, I was looking for a way to pass a whole result set to a CLR user-defined function. The aim was to let the database server (SQL Server 2012) create a survival plot using the Kaplan-Meier estimator. A table-valued parameter seemed promising, but unfortunately you cannot pass a user-defined table type as a table-valued parameter to a managed stored procedure or function executing in the SQL Server process. So, how do you deal with the need to pass a complete result set into a user-defined function?

Objectives

The idea was to create a CLR user-defined aggregate that reads the data into a custom “SurvInfo” object, which is then passed as a parameter into a downstream CLR method that creates the survival curve. The chart itself is drawn using the Windows Forms chart controls, and the aggregate returns a varbinary(max) value containing the image.

Our SurvInfo object will hold all the survival times, the survival status, and the grouping attribute. Furthermore, it will hold the survival probability (s_i) and the number of “patients at risk” (n_i) at each time point.

Because our data has to be serialized during aggregation, our SurvInfo class has to be serializable.

An aggregate is created by defining a class tagged with the SqlUserDefinedAggregate attribute. Using the attribute, we are able to define the following options:

Format: Serialization format of the class, typically either Native or UserDefined. With the Native format, the framework handles all the steps necessary to serialize and deserialize the structure; aggregates using the UserDefined format have to implement and handle their own serialization (see below).

IsInvariantToDuplicates (bool): true if duplicate values do not affect the result

IsInvariantToNulls (bool): true if NULL values do not affect the result

IsInvariantToOrder (bool): true if the order of the rows does not affect the result

IsNullIfEmpty (bool): true if the aggregate returns NULL for an empty input set

Name (string): sets the name of the aggregate

The class itself must contain the following methods:

Init: called when a new group of values is about to be processed by an instance of the class

Accumulate: each value is passed to the Accumulate method, which performs the necessary calculations. Here we fill our SurvInfo object

Merge: used to merge the data when the original set of values has been split into several independent groups processed in parallel

Terminate: returns the result and hosts any further processing based on the aggregated data. Here we place our method to generate the chart.

Because we are dealing with a custom data type, we have to implement the IBinarySerialize interface in our aggregate, which requires implementing the “Read” and “Write” methods. User-defined aggregates that handle their own serialization must be marked with Format.UserDefined. We use the BinaryFormatter class to serialize and deserialize our custom object.
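Putting the pieces together, the aggregate could be sketched as below. This is a sketch under the assumptions stated so far, not the actual project code; the class name "KaplanMeierChart" and the RenderChart placeholder are illustrative.

```csharp
// Sketch of a CLR user-defined aggregate with UserDefined serialization.
// "KaplanMeierChart" and "RenderChart" are illustrative names; the real
// chart-rendering code is omitted here.
using System;
using System.Collections.Generic;
using System.Data.SqlTypes;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using Microsoft.SqlServer.Server;

[Serializable]
public class SurvInfo
{
    public List<double> Time = new List<double>();
    public List<bool> Status = new List<bool>();
    public List<string> Group = new List<string>();
}

[Serializable]
[SqlUserDefinedAggregate(
    Format.UserDefined,            // we handle serialization via IBinarySerialize
    IsInvariantToNulls = true,
    IsInvariantToOrder = false,
    IsNullIfEmpty = true,
    MaxByteSize = -1)]             // -1 lifts the 8000-byte state limit (SQL 2008+)
public class KaplanMeierChart : IBinarySerialize
{
    private SurvInfo data;

    public void Init()
    {
        data = new SurvInfo();
    }

    public void Accumulate(SqlDouble time, SqlBoolean status, SqlString grp)
    {
        if (time.IsNull || status.IsNull) return;
        data.Time.Add(time.Value);
        data.Status.Add(status.Value);
        data.Group.Add(grp.IsNull ? "" : grp.Value);
    }

    public void Merge(KaplanMeierChart other)
    {
        // Combine partial states produced by parallel operations.
        data.Time.AddRange(other.data.Time);
        data.Status.AddRange(other.data.Status);
        data.Group.AddRange(other.data.Group);
    }

    public SqlBytes Terminate()
    {
        // Placeholder for the survival calculation and chart rendering
        // described above; returns the chart image as varbinary(max).
        return new SqlBytes(RenderChart(data));
    }

    public void Read(BinaryReader r)
    {
        data = (SurvInfo)new BinaryFormatter().Deserialize(r.BaseStream);
    }

    public void Write(BinaryWriter w)
    {
        new BinaryFormatter().Serialize(w.BaseStream, data);
    }

    private static byte[] RenderChart(SurvInfo s) { return new byte[0]; }
}
```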

We developed a SQL Server user-defined aggregate that aggregates multiple columns into a Kaplan-Meier chart returned as binary data. This method works around the restriction that table-valued parameters cannot be passed into a CLR-enabled function.

The downside of this method is the use of the aggregate itself: parameters that are not naturally columns must still be passed to the user-defined aggregate as columns. This solution will certainly run into performance issues when dealing with large amounts of data.
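With those caveats, calling the aggregate could look roughly like this. The aggregate, table, and column names are hypothetical:

```sql
-- Hypothetical schema: dbo.Observations(PatientId, SurvivalMonths, Deceased, Treatment)
-- The aggregate consumes survival time, status, and group label as columns
-- and returns the rendered chart image as varbinary(max).
SELECT dbo.KaplanMeierChart(o.SurvivalMonths, o.Deceased, o.Treatment) AS ChartPng
FROM dbo.Observations AS o;
```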

Today I wanted to find out whether a report is used as a subreport. In my project I make heavy use of subreports, so searching manually was not an option. I came up with a solution that queries the RDL code stored in the ReportServer database to produce a complete list of reports that are used as subreports, or to find a specific subreport and its main report.
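One possible shape of such a query is sketched below. It assumes the default ReportServer catalog layout, where Catalog.Type = 2 marks reports and Catalog.Content holds the RDL as binary data; treat it as a sketch rather than the exact query I used:

```sql
-- Find reports whose RDL contains a <Subreport> element.
-- Converting Content via xml handles the RDL's encoding before the text search.
SELECT c.[Path], c.Name
FROM ReportServer.dbo.[Catalog] AS c
WHERE c.[Type] = 2
  AND CONVERT(nvarchar(max),
        CONVERT(xml, CONVERT(varbinary(max), c.Content))) LIKE '%<Subreport%';
```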

With SQL Server 2008, Microsoft introduced user-defined table types. These table types can be used as parameters, the so-called table-valued parameters, giving you the ability to send multiple rows of data into a stored procedure or function. However, this feature does not apply to CLR user-defined functions.
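The mechanics can be sketched as follows; the type, procedure, and column names are hypothetical:

```sql
-- Define a table type matching the survival data described below.
CREATE TYPE dbo.SurvivalRow AS TABLE
(
    PatientId   int          NOT NULL,
    SurvTime    float        NOT NULL,
    SurvStatus  bit          NOT NULL,  -- e.g. dead (1) / alive (0)
    GroupLabel  nvarchar(50) NULL
);
GO

-- Table-valued parameters must be declared READONLY.
CREATE PROCEDURE dbo.usp_ProcessSurvival
    @rows dbo.SurvivalRow READONLY
AS
BEGIN
    SELECT GroupLabel, COUNT(*) AS Observations
    FROM @rows
    GROUP BY GroupLabel;
END
```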

I used this new feature to calculate Kaplan-Meier survival estimates at the database level. For the calculation you need the observed survival time, the survival status (e.g. dead (1) / alive (0)), and, in case you want to compare different groups, a grouping factor for each observation. This data can be passed as a custom table type into a user-defined function that calculates the cumulative survival rate and its variance for each time point.
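For reference, the quantities being computed are the Kaplan-Meier estimator and its Greenwood variance: with $d_j$ events among $n_j$ patients at risk at time $t_j$,

```latex
\hat{S}(t) = \prod_{j:\, t_j \le t} \left( 1 - \frac{d_j}{n_j} \right),
\qquad
\widehat{\operatorname{Var}}\!\left[\hat{S}(t)\right]
  = \hat{S}(t)^2 \sum_{j:\, t_j \le t} \frac{d_j}{n_j \,(n_j - d_j)}
```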

Last month the Microsoft Research team announced AzureBlast, a case study of a scientific application developed for cloud computing. AzureBlast is a massively parallel implementation of the BLAST search engine using the newly released BLAST+ executables.

The general principle of parallelizing sequence alignment has not changed with this implementation. Given a number of query sequences, AzureBlast first splits the input into multiple partitions, each of which is delivered to one worker instance for execution. Once all partitions have been processed, the results are merged. For more details see this PDF.

It sounds like a perfectly scalable method to support the analysis of pyrosequencing results. I think I have to try it for myself, and I will come back to the topic as soon as I have performed some tests.