opm: an R package to analyse OmniLog Phenotype MicroArray data

An introduction to opm

opm: an R package to analyse OmniLog Phenotype MicroArray data

Dr. Johannes Sikorski, Dr. Lea Vaas, Dr. Markus Göker
Leibniz-Institut DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH
www.dsmz.de

You have numerous OmniLog Phenotype MicroArray data (www.biolog.com) of closely related organisms or cell lines, or of numerous well-defined mutants, obtained under diverse physiological test conditions, and you want to explore them thoroughly and quantitatively within diverse analysis frameworks.

opm: Tools for analysing OmniLog(R) Phenotype Microarray data

opm enables you to:
- organize your PM data, curve parameters and metadata
- subset and query your data
- display raw kinetics or aggregated curve parameters graphically
- exploit the full statistics implemented in R
- export to third-party software using YAML

Software requirements

- R (http://www.r-project.org/): a free software environment for statistical computing and graphics.
- RStudio (http://www.rstudio.org/): a free and open source integrated development environment (IDE) for R.
- opm (http://cran.r-project.org/web/packages/opm/index.html): the add-on package "opm: Tools for analysing OmniLog(R) Phenotype Microarray data".

R code of this presentation

The R code of this presentation is available on request from:
- Dr. Johannes Sikorski, johannes.sikorski@dsmz.de
- Dr. Lea Vaas, l.vaas@cbs.knaw.nl
- Dr. Markus Göker, markus.goeker@dsmz.de

Feel free to contact us with any questions regarding the usage of opm.

OPM organizes your PM data in OPMS objects

Example: a set of 9 PM plates of the same plate type. The size of the OPMS object is limited only by the amount of RAM.

An OPMS object stores, for each of the 9 plates:
- per well: raw kinetic data
- per well: aggregated curve parameters, with confidence intervals from bootstrapping
- per plate: any metadata of interest to the user

Raw kinetic data (example: well "lysin"):

Hour       00.00  00.25  00.50  ...  30.00  ...  60.00
Intensity     35     33     37  ...    102  ...    328

Aggregated curve parameters (example: well "lysin"):

Parameter   Estimate        CI95 low        CI95 high
mu             15.559078        3.803466      140.841704
lambda          5.798210        1.080333       11.819251
A             305.989319      305.642353      306.986123
AUC         23308.269348    23125.092442    23411.648024

Lag = lambda, Slope = mu, Max = A, Area Under the Curve = AUC

Metadata (example: Plate 3):
- Taxonomy: Bacillus subtilis
- habitat: soil
- sampling place: GPS coord.
- sampling date: 2011-06-15
- sampling season: summer
- habitat [°C]: 27
- sporulation: yes
- PCR (gene xyz): positive
- ... as much and what you wish
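To make the three kinds of stored information concrete, the following is a minimal base-R sketch of how one plate could be laid out as a plain list. It is only an illustration: the names plate and plate_set are hypothetical, the values are copied from the example above, and this is not how opm defines its OPMS class internally.

```r
# Hypothetical stand-in for a single plate; NOT the opm class definition.
plate <- list(
  # per well: raw kinetic data (only the time points shown above)
  measurements = data.frame(
    Hour  = c(0.00, 0.25, 0.50, 30.00, 60.00),
    lysin = c(35, 33, 37, 102, 328)
  ),
  # per well: aggregated curve parameters (point estimates only)
  aggregated = c(mu = 15.559078, lambda = 5.798210,
                 A = 305.989319, AUC = 23308.269348),
  # per plate: free-form metadata
  metadata = list(Taxonomy = "Bacillus subtilis", habitat = "soil",
                  `sampling date` = "2011-06-15", sporulation = "yes")
)

# Three copies stand in for a set of plates of the same plate type.
plate_set <- list(plate, plate, plate)

length(plate_set)
plate_set[[1]]$metadata$Taxonomy
```

The point of the layout is that kinetics, curve parameters and metadata travel together, so subsetting the set of plates keeps all three in sync.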

read_opm()

Load demo files that come with the opm package

    read_opm(names, convert = c("try", "no", "yes", "sep", "grp"), gen.iii = FALSE, include = list(), ..., demo = FALSE)

    # Use the built-in function opm_files() to retrieve the paths of the
    # example files in your R installation:
    (files <- opm_files("testdata"))

    # Read in the files, which are compressed,
    # using the include argument to select specific plates of interest.
    # This loads three files into the object "example.opm":
    example.opm <- read_opm(files, include = "*Example_?.csv.xz")
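The include pattern above is a wildcard (glob) pattern. As a rough illustration of how such a pattern selects file names, the sketch below uses only base R (glob2rx() and grepl()). The file names are hypothetical, and opm's own matching may differ in detail.

```r
# Hypothetical file names, for illustration only.
candidates <- c("Example_1.csv.xz", "Example_2.csv.xz",
                "Example_10.csv.xz", "Other_1.csv.xz")

# Translate the wildcard pattern into a regular expression:
# "*" matches any run of characters, "?" matches exactly one character.
pattern <- glob2rx("*Example_?.csv.xz")

selected <- candidates[grepl(pattern, candidates)]
print(selected)
```

Because "?" stands for exactly one character, a name such as Example_10.csv.xz is not selected by this pattern.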

Load your own files

    # read in all CSV raw data files in your working directory
    PM1 <- read_opm(".")

    # read in all CSV raw data files in your working directory
    # and convert the plate type to GenIII plates
    GenIII <- read_opm(".", gen.iii = TRUE)

Load demo files that come with the opm package

    # let us check some information on the files in this OPMS object
    plates(example.opm)
    summary(example.opm)
    show(example.opm)
    dim(example.opm)
    hours(example.opm)
    length(example.opm)
    max(example.opm)
    plate_type(example.opm)
    seq(example.opm)
    setup_time(example.opm)
    measurements(example.opm)
    wells(example.opm)
    wells(example.opm, full = TRUE)

measurements(example.opm) returns the raw kinetic data.

do_aggr()

    # aggregate only A and AUC using a fast algorithm
    x <- do_aggr(example.opm, program = "opm-fast")

    # aggregate all 4 parameters using a spline fit algorithm (grofit package)
    x <- do_aggr(example.opm)

    # include 100 bootstrap replicates (note: time-consuming)
    x <- do_aggr(example.opm, program = "opm-fast", boot = 100)
    x <- do_aggr(example.opm, boot = 100)
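To give an idea of what the two "fast" parameters mean, here is a minimal base-R sketch that computes the maximum (A) and the area under the curve (AUC, by the trapezoid rule) for one made-up curve, followed by a simple bootstrap over time points. The function trapezoid_auc() and the data are hypothetical, and the resampling scheme is an assumption chosen for illustration; it is not necessarily what opm or grofit do internally.

```r
# Area under a curve by the trapezoid rule (hypothetical helper).
trapezoid_auc <- function(time, value) {
  sum(diff(time) * (head(value, -1) + tail(value, -1)) / 2)
}

# Made-up kinetic data for a single well.
time  <- c(0, 1, 2, 3, 4)
value <- c(0, 2, 4, 6, 8)

A   <- max(value)                  # maximum height of the curve
AUC <- trapezoid_auc(time, value)  # area under the curve

# Bootstrap sketch: resample time points with replacement, keep them in
# time order, and recompute the AUC for each of 100 replicates.
set.seed(1)
boot_auc <- replicate(100, {
  idx <- sort(sample(seq_along(time), replace = TRUE))
  trapezoid_auc(time[idx], value[idx])
})

# Percentile-based 95% interval of the bootstrapped AUC values.
ci <- quantile(boot_auc, c(0.025, 0.975))
```

The bootstrap loop repeats the whole estimation once per replicate, which is why adding boot = 100 makes aggregation considerably slower.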

Check aggregated data

    aggregated(example.opm)

                             A01          A02            A03           A04           A07
mu                      4.242159     5.769109     0.02138581     0.2827407     0.2383062
lambda                 -2.340620    12.799329  -465.46803431    20.0749555   -14.4573092
A                      47.923185    62.738943    11.51078807    19.4617762    18.2811191
AUC                  3914.852139  4154.830048  1070.20657323  1250.9426009  1396.9447154
mu CI95 low             2.733574     3.045267    -1.10076311    -2.2050686    -4.8515830
lambda CI95 low       -38.403543   -10.300782    56.14216650    42.4248855    24.8184260
A CI95 low             47.197513    58.940763    11.17285004    19.1992801    16.9627344
AUC CI95 low         3875.243148  4093.577722  1056.62986435  1230.3571787  1352.9702303
mu CI95 high           14.170557    13.689212     6.15737265     9.3063345    21.5309783
lambda CI95 high       79.044830    50.248293    87.70587107   106.1197708   107.3697670
A CI95 high            52.484756    67.456369    15.37628753    23.6590936    30.0717055
AUC CI95 high        3941.361758  4183.239559  1077.02925382  1262.9208049  1432.5071603

You need to provide the metadata separately

One problem arises: imagine you have numerous plates, each with numerous metadata. How can we make sure that the metadata are matched CORRECTLY to the specific raw kinetic data?

Solution: we need an identifier that unambiguously matches metadata to raw kinetic data. As the identifier we use the Setup Time and the Position of the plate in the reader.

Good news: opm can export this information as a starting point for the metadata file, using the function collect_template().
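The idea of matching on a combined identifier can be sketched with base R's merge(). The two data frames below are hypothetical mock data (including the column max_value and the species labels); in practice opm performs the matching for you, so this only illustrates why Setup Time plus Position is sufficient even when the rows are in a different order.

```r
# Mock kinetic summary, one row per plate (hypothetical values).
kinetics <- data.frame(
  `Setup Time` = c("8/30/2010 11:28:54 AM", "8/30/2010 11:28:54 AM"),
  Position     = c("21-B", "22-B"),
  max_value    = c(328, 301),
  check.names  = FALSE
)

# Mock metadata, deliberately listed in a different row order.
metadata <- data.frame(
  `Setup Time` = c("8/30/2010 11:28:54 AM", "8/30/2010 11:28:54 AM"),
  Position     = c("22-B", "21-B"),
  species      = c("species B", "species A"),
  check.names  = FALSE
)

# Match rows on the combined identifier, not on row order.
merged <- merge(kinetics, metadata, by = c("Setup Time", "Position"))
print(merged)
```

Both plates share the same setup time here, so the position is what tells them apart; neither column alone would be a safe identifier.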

collect_template(): create a data frame to which further metadata columns can be added

    metadata <- collect_template(files, include = "*Example_?.csv.xz")

The resulting data frame contains the unique identifier used to merge metadata and raw kinetic data.

collect_template(): write a CSV file (or *.txt, *.dat) and add further metadata columns in a spreadsheet application

    collect_template(files, include = "*Example_?.csv.xz", outfile = "template.csv")

Note the FORMAT: columns are tab-separated and fields are protected by quotation marks.

After adding further metadata columns to the CSV file (or *.txt, *.dat), save it tab-separated with quotation marks as field protectors, then load the file into the R environment using to_metadata():

    metadata.example <- to_metadata("template.csv")
    metadata.example <- to_metadata("template.csv", strip.white = FALSE)
    metadata.example <- to_metadata("template.csv", sep = ",")
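The file format described here (tab-separated columns, fields protected by quotation marks) can be reproduced with base R alone. The sketch below writes a one-row mock template to a temporary file and reads it back; the data frame contents are hypothetical, and the base functions write.table() and read.delim() merely stand in for what collect_template() and to_metadata() do.

```r
# Mock template with one added metadata column (hypothetical values).
template <- data.frame(
  `Setup Time` = "8/30/2010 11:28:54 AM",
  Position     = "21-B",
  species      = "Bacillus subtilis",
  check.names  = FALSE
)

# Write tab-separated, with quotation marks protecting each field.
path <- tempfile(fileext = ".csv")
write.table(template, path, sep = "\t", quote = TRUE, row.names = FALSE)

# Read it back; read.delim() expects tabs and double quotes by default.
reloaded <- read.delim(path, check.names = FALSE, stringsAsFactors = FALSE)
print(reloaded)
```

The quotation marks matter because values such as the setup time contain spaces, which a spreadsheet application might otherwise split or reinterpret.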

The template used for demonstration contains further added metadata columns. Note: these are mock metadata for demonstration purposes.

include_metadata(): combine the data frame with metadata ("metadata") and the OPMS object with raw kinetic data ("example.opm")

    example.opm.metadata <- include_metadata(example.opm, md = metadata)

Draw kinetic data

    xy_plot(example.opm)

xy_plot(example.opm, col = c("blue", "red", "green"), lwd = 2)

xy_plot(example.opm, col = c("blue", "red", "green"), lwd = 2, legend.fmt = list(space = "right"))

xy_plot(example.opm, col = c("blue", "red", "green"), lwd = 2, legend.fmt = list(space = "right"), include = c("species", "strain"))

The panel strip, strip text, and legend can be modified by using arguments from lattice.

What about drawing only parts?

It is possible to plot (1) specific plates, (2) time points, or (3) wells by indexing OPMS objects using square brackets. Schematically:

    xy_plot(example.opm[plates, time points, wells])
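The three-position indexing follows the same convention as a three-dimensional array in base R. The sketch below uses a plain array as a stand-in (the object toy and its contents are hypothetical); it is not an OPMS object, but it shows how leaving a position empty selects everything along that dimension.

```r
# Toy 3-D array: 3 plates x 5 time points x 4 wells (hypothetical data).
toy <- array(
  seq_len(3 * 5 * 4),
  dim = c(3, 5, 4),
  dimnames = list(plate = paste0("plate", 1:3),
                  time  = NULL,
                  well  = c("A01", "A02", "E05", "G08"))
)

all_of_plate3 <- toy[3, , ]                     # plate 3, all times, all wells
early_plate3  <- toy[3, 1:2, ]                  # plate 3, first two time points
two_wells     <- toy[3, 1:2, c("A01", "E05")]   # ... and only two wells

dim(all_of_plate3)
dim(two_wells)
```

Each further restriction narrows the selection along one dimension only, which mirrors the step-by-step plots that follow.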

xy_plot(example.opm[,, ])

xy_plot(example.opm[ 3,, ])

xy_plot(example.opm[ 3, 1:100, ])

xy_plot(example.opm[ 3, 1:100, c("a01", "A02", "E05", "G08", "H10")])

Heatmaps compare plates on the basis of aggregated curve parameters

The generation of heatmaps includes two steps: (1) Extract the curve parameter values using extract() (2) Create the heatmap using heat_map()

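First step:

    auc <- extract(example.opm, dataframe = TRUE, as.labels = list("country", "species", "strain", "town"), subset = "AUC")

Here as.labels names the metadata of interest, and subset names the parameter whose values, obtained by aggregating the curve parameters, are extracted. The result is stored as "auc" because R is case-sensitive and the second step refers to it by that name.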

Second step:

    heat_map(auc, as.labels = c("species", "town"), as.groups = "town", cexRow = 1.2, use.fun = "gplots", main = "nice heatmap", col = topo.colors(120))
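A heatmap of this kind orders the plates by hierarchical clustering of their parameter values. The base-R sketch below shows that underlying step on a small mock matrix (the matrix auc_mock and its values are hypothetical); it uses dist(), hclust() and cutree() directly and is not a reproduction of what heat_map() does internally.

```r
# Mock AUC values: rows are plates, columns are wells (hypothetical data).
auc_mock <- rbind(
  plate1 = c(A01 = 3900, A02 = 4100, A03 = 1000),
  plate2 = c(A01 = 3950, A02 = 4150, A03 = 1100),
  plate3 = c(A01 = 1200, A02 =  900, A03 = 4000)
)

# Cluster plates by the Euclidean distance between their AUC profiles.
tree <- hclust(dist(auc_mock))

# Cut the tree into two groups of plates.
groups <- cutree(tree, k = 2)
print(groups)
```

Plates with similar AUC profiles end up in the same group, which is what the row dendrogram of a heatmap visualises.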

Confidence interval plot Do curves differ significantly in aggregated curve parameters? We make use of the 95% confidence intervals calculated from 100 bootstrap replicates.
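One simple and conservative way to read such intervals is to check whether two 95% confidence intervals overlap. The sketch below applies this to values taken from the aggregated-data table shown earlier (wells A01 and A02); the helper ci_overlap() is hypothetical, and treating non-overlap as "significantly different" is a rule of thumb rather than a formal test.

```r
# TRUE if the intervals [low1, high1] and [low2, high2] overlap.
ci_overlap <- function(low1, high1, low2, high2) {
  low1 <= high2 && low2 <= high1
}

# Parameter A: A01 = [47.197513, 52.484756], A02 = [58.940763, 67.456369]
overlap_A  <- ci_overlap(47.197513, 52.484756, 58.940763, 67.456369)

# Parameter mu: A01 = [2.733574, 14.170557], A02 = [3.045267, 13.689212]
overlap_mu <- ci_overlap(2.733574, 14.170557, 3.045267, 13.689212)

print(c(A = overlap_A, mu = overlap_mu))
```

In this example the intervals for A do not overlap while those for mu do, so the two curves would be judged to differ in their maximum but not in their slope.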

xy_plot(example.opm) In which aggregated curve parameters do these curves differ significantly?

xy_plot(example.opm[,,"d10"], include = list("species","town"), neg.ctrl = FALSE)

ci_plot(example.opm[,, c("d10")], as.labels = list("species","town"), subset = "A")

ci_plot(example.opm[,, c("d10")], as.labels = list("species","town"), subset = "AUC")

ci_plot(example.opm[,, c("d10")], as.labels = list("species","town"), subset = "lambda")

xy_plot(example.opm) Do these curves differ in their lag phase? Try it yourself.

radial_plot(example.opm[,, 5:17], sep = " ", as.labels = c("species", "town"), draw.legend = FALSE, subset = "AUC")

data(vaas_et_al)

The vaas_et_al data set contains:
- 114 GenIII plates (run 96 hours)
- numerous replicates of two strains each of Escherichia coli and Pseudomonas aeruginosa
- aggregated, bootstrapped curve parameters and metadata
