Study design
RNA samples from the tumors of 20 breast cancer patients, hybridized to cDNA microarrays.
Reference design: each tumor sample was compared with pooled mRNA from 11 cell lines.
Paired data: two samples (arrays) per patient, one before and one after chemotherapy.
For the analysis here, a subset of 2998 genes will be used.

1. Load data. Option 1: Collate
If we are going to work with our own data (.CEL or .gpr files) or with data obtained from a database, we must first import it into the format used by BRB. This is done with:
Array Tools > Import data > Import wizard
In this tutorial we will use an example project that has already been created.

1. Load data. Option 2: Existing project
A BRB project workbook with the prepared data is available in the sample datasets folder. Its name is "Perou.xls". Load the project and inspect the four worksheets it contains:
Experimental Descriptors
Gene Identifiers
Gene Annotations
Filtered Log Ratios

Where are the data?
By default the data are hidden. You can display some or all of them by clicking the button in the upper-left corner labelled "click to display the data".
Warning! The button calls a macro located in C:\Program Files\ArrayTools\Excel\
On a Spanish-language Windows installation this folder is C:\Archivos de Programa\ArrayTools\Excel and the path has to be changed accordingly. You can do it yourself by right-clicking the button and editing the path in the Assign Macro option.

2 & 3. Preprocessing steps: Filtering and Normalization
After importing or loading the data, and before the analysis step, the data must be pre-processed. This involves two types of action:
Filtering excludes bad spots or adjusts intensities that are too low or too high to more reasonable values.
Normalization corrects for biases (systematic errors) that arise from technical causes rather than from biological variability.

2. Filtering spots and adjusting signals
We may filter the data on intensity by excluding spots where both the red and green channel intensities are below 100.
If only one of the two channel intensities is below the minimum of 100, we may set that intensity to the minimum.
In addition, we may use the flag column imported with the data and exclude spots whose flag value is not equal to 1.
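The three filtering rules above can be sketched in a few lines of Python. This is only an illustration of the logic, not the code ArrayTools runs: the function name `filter_spots`, the toy intensities and the use of NaN to mark excluded spots are all assumptions made for the example.

```python
import numpy as np

MIN_INTENSITY = 100.0  # minimum intensity quoted in the text

def filter_spots(red, green, flag, min_intensity=MIN_INTENSITY):
    """Return log2(red/green) per spot, with excluded spots set to NaN.

    Hypothetical helper: illustrates the rules only.
    """
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    flag = np.asarray(flag)

    # Rule 1: exclude spots where both channels are below the minimum.
    both_low = (red < min_intensity) & (green < min_intensity)
    # Rule 3: exclude spots whose flag is not equal to 1.
    bad_flag = flag != 1
    excluded = both_low | bad_flag

    # Rule 2: if only one channel is low, raise it to the minimum.
    red_adj = np.maximum(red, min_intensity)
    green_adj = np.maximum(green, min_intensity)

    log_ratio = np.log2(red_adj / green_adj)
    log_ratio[excluded] = np.nan
    return log_ratio

# Toy data (invented): four spots on one array.
red = [50, 50, 800, 400]
green = [60, 400, 200, 400]
flag = [1, 1, 1, 0]
ratios = filter_spots(red, green, flag)
```

In the toy data the first spot is dropped because both channels are low, the second has its red channel raised to 100 before the ratio is taken, and the last is dropped because of its flag.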

Must we filter the data?
Filtering is intended to remove spots whose images or signals are unreliable, for example because of:
A small quantity of cDNA on the array
Errors during the scanning process
Some people prefer not to filter, to avoid eliminating good spots unintentionally. When in doubt, be conservative and keep filtering to a minimum.

3. Normalization
A quick inspection of the data (e.g. with MA plots) will show whether normalization is needed.
First we normalize the data by subtracting the median log ratio of each array from all log ratios on that array.
Later we will normalize the data by subtracting a non-linear transformation, using the loess option.
No print-tip group information is available, so print-tip normalization is not possible.
We will construct M-A plots to evaluate the results of each normalization option.
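Median normalization as described above amounts to one subtraction per array. A minimal sketch follows, assuming the log ratios are held in a matrix with genes in rows and arrays in columns and that filtered spots are stored as NaN; the function name and the toy matrix are invented for the example. The loess step is not shown.

```python
import numpy as np

def median_normalize(log_ratios):
    """Subtract each array's median log ratio from every log ratio on that array.

    Assumes rows are genes and columns are arrays; NaN marks filtered spots.
    """
    log_ratios = np.asarray(log_ratios, dtype=float)
    medians = np.nanmedian(log_ratios, axis=0)  # one median per array
    return log_ratios - medians

# Toy matrix (invented): 3 genes x 2 arrays, one filtered spot.
data = np.array([[0.5, -1.0],
                 [1.5,  0.0],
                 [2.5, np.nan]])
normalized = median_normalize(data)
```

After this step the median log ratio of every array is zero, which is the property the M-A plots are used to check.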

Is normalization necessary?
MA plots can show whether the data need to be normalized (they usually do).
To draw an MA plot go to:
Array Tools > Plugins > M vs A plot
Asymmetrical clouds that are not centered around zero suggest the need for normalization.
Symmetrical, narrow clouds suggest that it can be omitted.
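For reference, the two coordinates of an MA plot can be computed directly from the channel intensities. The sketch below uses the usual definitions, M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2; the function name and the intensities are made up for illustration, and the plugin's own implementation may differ in detail.

```python
import numpy as np

def ma_values(red, green):
    """Return (M, A) for each spot from red and green channel intensities."""
    log_r = np.log2(np.asarray(red, dtype=float))
    log_g = np.log2(np.asarray(green, dtype=float))
    m = log_r - log_g          # log ratio (vertical axis)
    a = 0.5 * (log_r + log_g)  # average log intensity (horizontal axis)
    return m, a

# Toy intensities (invented) for three spots.
m, a = ma_values([400, 1600, 200], [100, 400, 800])
median_m = np.median(m)  # a value far from zero hints that centering is needed
```

Plotting `m` against `a` for one array gives the cloud described above; its vertical offset from zero and its curvature are what normalization is meant to remove.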

4. Finding differentially expressed genes
Quick fold-change scatter plots can be used to inspect up- or down-regulated genes in each experiment.
They are useful for looking at specific arrays, but the results cannot be generalized.
The best approach is to combine all samples and perform a test of differential expression (DE).

Comparing visual checks
The list of genes upregulated before and after chemotherapy is not the same for patients 10 and

4.2 Class comparison tests
A test for differential gene expression between pre- and post-chemotherapy samples can be done using a paired t-test.
To avoid depending on normality assumptions, p-values can be computed using a permutation approach.
The number and proportion of false discoveries must be controlled. They can also be estimated.
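To make the idea of a permutation-based paired test concrete, here is a small Python sketch. It assumes one common scheme for paired data, randomly flipping the sign of each patient's post-minus-pre difference, and is not taken from ArrayTools; the function names, the number of permutations and the simulated data are all assumptions for the example.

```python
import numpy as np

def paired_t(diffs):
    """Paired t statistic per gene; diffs has shape (patients, genes)."""
    n = diffs.shape[0]
    return diffs.mean(axis=0) / (diffs.std(axis=0, ddof=1) / np.sqrt(n))

def paired_permutation_pvalues(before, after, n_perm=2000, seed=0):
    """Two-sided permutation p-value per gene using random sign flips."""
    rng = np.random.default_rng(seed)
    diffs = np.asarray(after, dtype=float) - np.asarray(before, dtype=float)
    observed = np.abs(paired_t(diffs))
    exceed = np.zeros(diffs.shape[1])
    for _ in range(n_perm):
        # Under the null, swapping pre/post within a patient flips the sign.
        signs = rng.choice([-1.0, 1.0], size=(diffs.shape[0], 1))
        exceed += np.abs(paired_t(diffs * signs)) >= observed
    # Add one to numerator and denominator so no p-value is exactly zero.
    return (exceed + 1) / (n_perm + 1)

# Simulated data (invented): 20 patients, 3 genes, only gene 0 changes.
rng = np.random.default_rng(1)
before = rng.normal(size=(20, 3))
after = rng.normal(size=(20, 3))
after[:, 0] += 2.0
pvals = paired_permutation_pvalues(before, after)
```

Because the null distribution is built from the data themselves, no normality assumption is needed; the price is that the smallest attainable p-value is limited by the number of permutations.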

Class comparison: Select test
There are several criteria for selecting genes, but only one can be applied at a time.
A threshold based on p-values is used in the example.
Array Tools > Class Comparison > Between groups of arrays

Class comparison: Set options
Using a permutation test avoids having to make normality assumptions.
The global test indicates the probability of selecting as many genes as were finally chosen if there were no real differences between the classes.
GO observed vs. expected can be used to find which functional classes appear to be enriched in the set of selected genes. It highlights functionally relevant classes, which may be related to important biological processes at work in the experiment.
Array Tools > Class Comparison > Options
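The observed vs. expected comparison can be illustrated with simple arithmetic. The sketch below assumes the expected count for a GO class is the number of selected genes multiplied by the fraction of all analyzed genes that belong to the class; this is a common definition and is stated here as an assumption, not as the exact ArrayTools computation. Apart from the 2998 genes of the study design, all counts are invented.

```python
def observed_vs_expected(n_selected_in_class, n_selected, n_class, n_total):
    """Ratio of observed to expected selected genes in one functional class.

    Hypothetical helper; the expected count assumes selection is unrelated to class.
    """
    expected = n_selected * n_class / n_total
    return n_selected_in_class / expected

# Invented counts: 100 selected genes, a GO class with 150 of the 2998 genes,
# 12 of which are among the selected genes.
ratio = observed_vs_expected(12, 100, 150, 2998)
```

A ratio well above 1 means the class contains more selected genes than chance alone would suggest.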

Results
The analysis results are written to a file named ClassComparison.html. It contains:
Description of the problem
Summary of results
Genes which discriminate among classes
[Optional] "Observed v. Expected" table of GO classes
