publications – andreas karwath

2010

Hardy, Barry J.; Douglas, Nicki; Helma, Christoph; Rautenberg, Micha; Jeliazkova, Nina; Jeliazkov, Vedrin; Nikolova, Ivelina; Benigni, Romualdo; Tcheremenskaia, Olga; Kramer, Stefan; Girschick, Tobias; Buchwald, Fabian; Wicker, Jörg; Karwath, Andreas; Gütlein, Martin; Maunz, Andreas; Sarimveis, Haralambos; Melagraki, Georgia; Afantitis, Antreas; Sopasakis, Pantelis; Gallagher, David; Poroikov, Vladimir; Filimonov, Dmitry; Zakharov, Alexey V.; Lagunin, Alexey; Gloriozova, Tatyana; Novikov, Sergey; Skvortsova, Natalia; Druzhilovsky, Dmitry; Chawla, Sunil; Ghosh, Indira; Ray, Surajit; Patel, Hitesh; Escher, Sylvia

Collaborative development of predictive toxicology applications Journal Article

In: J. Cheminformatics, vol. 2, pp. 7, 2010.

Abstract | Links | BibTeX | Tags: crossvalidation, data mining, QSAR, scientific knowledge, validation

@article{hardy2010,

title = {Collaborative development of predictive toxicology applications},

author = {Barry J. Hardy and Nicki Douglas and Christoph Helma and Micha Rautenberg and Nina Jeliazkova and Vedrin Jeliazkov and Ivelina Nikolova and Romualdo Benigni and Olga Tcheremenskaia and Stefan Kramer and Tobias Girschick and Fabian Buchwald and Jörg Wicker and Andreas Karwath and Martin Gütlein and Andreas Maunz and Haralambos Sarimveis and Georgia Melagraki and Antreas Afantitis and Pantelis Sopasakis and David Gallagher and Vladimir Poroikov and Dmitry Filimonov and Alexey V. Zakharov and Alexey Lagunin and Tatyana Gloriozova and Sergey Novikov and Natalia Skvortsova and Dmitry Druzhilovsky and Sunil Chawla and Indira Ghosh and Surajit Ray and Hitesh Patel and Sylvia Escher},

url = {http://dx.doi.org/10.1186/1758-2946-2-7},

doi = {10.1186/1758-2946-2-7},

year  = {2010},

date = {2010-08-31},

urldate = {2010-08-31},

journal = {J. Cheminformatics},

volume = {2},

pages = {7},

abstract = {OpenTox provides an interoperable, standards-based Framework for the support of predictive toxicology data management, algorithms, modelling, validation and reporting. It is relevant to satisfying the chemical safety assessment requirements of the REACH legislation as it supports access to experimental data, (Quantitative) Structure-Activity Relationship models, and toxicological information through an integrating platform that adheres to regulatory requirements and OECD validation principles. Initial research defined the essential components of the Framework including the approach to data access, schema and management, use of controlled vocabularies and ontologies, architecture, web service and communications protocols, and selection and integration of algorithms for predictive modelling. OpenTox provides end-user oriented tools to non-computational specialists, risk assessors, and toxicological experts in addition to Application Programming Interfaces (APIs) for developers of new applications. OpenTox actively supports public standards for data representation, interfaces, vocabularies and ontologies, Open Source approaches to core platform components, and community-based collaboration approaches, so as to progress system interoperability goals.



The OpenTox Framework includes APIs and services for compounds, datasets, features, algorithms, models, ontologies, tasks, validation, and reporting which may be combined into multiple applications satisfying a variety of different user needs. OpenTox applications are based on a set of distributed, interoperable OpenTox API-compliant REST web services. The OpenTox approach to ontology allows for efficient mapping of complementary data coming from different datasets into a unifying structure having a shared terminology and representation.



Two initial OpenTox applications are presented as an illustration of the potential impact of OpenTox for high-quality and consistent structure-activity relationship modelling of REACH-relevant endpoints: ToxPredict which predicts and reports on toxicities for endpoints for an input chemical structure, and ToxCreate which builds and validates a predictive toxicity model based on an input toxicology dataset. Because of the extensible nature of the standardised Framework design, barriers of interoperability between applications and content are removed, as the user may combine data, models and validation from multiple sources in a dependable and time-effective way.},

keywords = {crossvalidation, data mining, QSAR, scientific knowledge, validation},

pubstate = {published},

tppubtype = {article}

}

OpenTox provides an interoperable, standards-based Framework for the support of predictive toxicology data management, algorithms, modelling, validation and reporting. It is relevant to satisfying the chemical safety assessment requirements of the REACH legislation as it supports access to experimental data, (Quantitative) Structure-Activity Relationship models, and toxicological information through an integrating platform that adheres to regulatory requirements and OECD validation principles. Initial research defined the essential components of the Framework including the approach to data access, schema and management, use of controlled vocabularies and ontologies, architecture, web service and communications protocols, and selection and integration of algorithms for predictive modelling. OpenTox provides end-user oriented tools to non-computational specialists, risk assessors, and toxicological experts in addition to Application Programming Interfaces (APIs) for developers of new applications. OpenTox actively supports public standards for data representation, interfaces, vocabularies and ontologies, Open Source approaches to core platform components, and community-based collaboration approaches, so as to progress system interoperability goals.

The OpenTox Framework includes APIs and services for compounds, datasets, features, algorithms, models, ontologies, tasks, validation, and reporting which may be combined into multiple applications satisfying a variety of different user needs. OpenTox applications are based on a set of distributed, interoperable OpenTox API-compliant REST web services. The OpenTox approach to ontology allows for efficient mapping of complementary data coming from different datasets into a unifying structure having a shared terminology and representation.

Two initial OpenTox applications are presented as an illustration of the potential impact of OpenTox for high-quality and consistent structure-activity relationship modelling of REACH-relevant endpoints: ToxPredict which predicts and reports on toxicities for endpoints for an input chemical structure, and ToxCreate which builds and validates a predictive toxicity model based on an input toxicology dataset. Because of the extensible nature of the standardised Framework design, barriers of interoperability between applications and content are removed, as the user may combine data, models and validation from multiple sources in a dependable and time-effective way.

2009

Schulz, Hannes; Kersting, Kristian; Karwath, Andreas

ILP, the Blind, and the Elephant: Euclidean Embedding of Co-proven Queries Conference

Inductive Logic Programming, 19th International Conference, ILP 2009, Springer-Verlag Berlin Heidelberg Springer Verlag, Berlin Heidelberg, Germany, 2009, ISBN: 978-3-642-13839-3.

Abstract | Links | BibTeX | Tags: cheminformatics, dimensionality reduction, inductive logic programming, relational learning, scientific knowledge, visualization

2008

Karwath, Andreas; Kersting, Kristian; Landwehr, Niels

Boosting Relational Sequence Alignments Conference

The 8th IEEE International Conference on Data Mining, ICDM 2008, IEEE, 2008, ISBN: 978-0-7695-3502-9.

Abstract | Links | BibTeX | Tags: inductive logic programming, machine learning, relational learning, scientific knowledge

Kersting, Kristian; De Raedt, Luc; Gutmann, Bernd; Karwath, Andreas; Landwehr, Niels

Relational Sequence Learning Book Chapter

In: Probabilistic Inductive Logic Programming - Theory and Applications, vol. 4911, pp. 28-55, Springer Verlag, Berlin Heidelberg, Germany, 2008, ISBN: 978-3-540-78651-1.

Abstract | Links | BibTeX | Tags: inductive logic programming, machine learning, relational learning, scientific knowledge

2007

Karwath, Andreas; Kersting, Kristian

Relational Sequence Alignments and Logos Conference

Inductive Logic Programming, 16th International Conference, ILP 2006, vol. 4455, Lecture Notes in Computer Science Springer-Verlag Berlin Heidelberg Springer Verlag, Berlin Heidelberg, Germany, 2007, ISBN: 978-3-540-73846-6.

Abstract | Links | BibTeX | Tags: bioinformatics, inductive logic programming, relational learning, scientific knowledge

King, Ross D.; Karwath, Andreas; Clare, Amanda; Dehaspe, Luc

Logic and the Automatic Acquisition of Scientific Knowledge: An Application to Functional Genomics Conference

Computational Discovery of Scientific Knowledge, Introduction, Techniques, and Applications in Environmental and Life Sciences, vol. 4660, Lecture Notes in Computer Science Springer-Verlag Berlin Heidelberg Springer Verlag, Berlin Heidelberg, Germany, 2007, ISBN: 978-3-540-73919-7.

Abstract | Links | BibTeX | Tags: bioinformatics, data mining, inductive logic programming, machine learning, relational learning, scientific knowledge

2006

Karwath, Andreas; De Raedt, Luc

SMIREP: Predicting Chemical Activity from SMILES Journal Article

In: Journal of Chemical Information and Modeling, vol. 46, no. 6, pp. 2432 - 2444, 2006.

Abstract | Links | BibTeX | Tags: cheminformatics, graph mining, machine learning, QSAR, relational learning, scientific knowledge

@article{karwath06c,

title = {SMIREP: Predicting Chemical Activity from SMILES},

author = {Andreas Karwath and De Raedt, Luc},

url = {http://pubs.acs.org/doi/abs/10.1021/ci060159g},

doi = {10.1021/ci060159g},

year  = {2006},

date = {2006-10-12},

journal = {Journal of Chemical Information and Modeling},

volume = {46},

number = {6},

pages = {2432 - 2444},

abstract = {Most approaches to structure-activity-relationship (SAR) prediction proceed in two steps. In the first step, a typically large set of fingerprints, or fragments of interest, is constructed (either by hand or by some recent data mining techniques). In the second step, machine learning techniques are applied to obtain a predictive model. The result is often not only a highly accurate but also hard to interpret model. In this paper, we demonstrate the capabilities of a novel SAR algorithm, SMIREP, which tightly integrates the fragment and model generation steps and which yields simple models in the form of a small set of IF-THEN rules. These rules contain SMILES fragments, which are easy to understand to the computational chemist. SMIREP combines ideas from the well-known IREP rule learner with a novel fragmentation algorithm for SMILES strings. SMIREP has been evaluated on three problems: the prediction of binding activities for the estrogen receptor (Environmental Protection Agency's (EPA's) Distributed Structure-Searchable Toxicity (DSSTox) National Center for Toxicological Research estrogen receptor (NCTRER) Database), the prediction of mutagenicity using the carcinogenic potency database (CPDB), and the prediction of biodegradability on a subset of the Environmental Fate Database (EFDB). In these applications, SMIREP has the advantage of producing easily interpretable rules while having predictive accuracies that are comparable to those of alternative state-of-the-art techniques.},

keywords = {cheminformatics, graph mining, machine learning, QSAR, relational learning, scientific knowledge},

pubstate = {published},

tppubtype = {article}

}

Karwath, Andreas; Kersting, Kristian

Relational Sequence Alignments Conference

Proc. The 4th International Workshop on Mining and Learning with Graphs, MLG 2006, % editor = Thomas Gärtner and Gemma C. Garriga and Thorsten Meinl, % month = September, 2006, (workshop).

BibTeX | Tags: bioinformatics, cheminformatics, relational learning, scientific knowledge