PUBLICATION DATE

2011-07-28

AUTHORS

Scott Boyer, Lars A Carlsson, Pedro Almeida, Jonna C Stålring

TITLE

AZOrange - High performance open source machine learning for QSAR modeling in a graphical programming environment

ISSUE

1

VOLUME

3

ISSN (print)

N/A

ISSN (electronic)

1758-2946

ABSTRACT

Background: Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds and the size of proprietary, as well as public data sets, is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. In granting the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source state-of-the-art high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community.

Results: This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models in providing the full work flow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated work flow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge as flexible applications can be created, not only at a scripting level, but also in a graphical programming environment.

Conclusions: AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models fulfilling regulatory requirements.


38 TRIPLES      30 PREDICATES      37 URIs      20 LITERALS

Subject Predicate Object
1 articles:e0274d5d846fafdab417ca260b49f810 sg:abstract (abstract text, identical to the ABSTRACT section above)
2 sg:articleType OriginalPaper
3 sg:coverYear 2011
4 sg:coverYearMonth 2011-12
5 sg:ddsId BMC1758-2946-3-28
6 sg:ddsIdJournalBrand 13321
7 sg:doi 10.1186/1758-2946-3-28
8 sg:doiLink http://dx.doi.org/10.1186/1758-2946-3-28
9 sg:hasContributingOrganization grid-institutes:grid.418151.8
10 sg:hasContribution contributions:11f5b715285321686791dee5e59f1aac
11 contributions:6fee411ed8f08337e2f193372555b4c7
12 contributions:798a752df64114729ff2c13ac0188532
13 contributions:e21b502f3eb61e1982e3b9eadd202823
14 sg:hasFieldOfResearchCode anzsrc-for:01
15 anzsrc-for:0104
16 anzsrc-for:08
17 anzsrc-for:0801
18 anzsrc-for:0803
19 sg:hasJournal journals:e9ea11faceeb40b85785ab5b36417058
20 sg:hasJournalBrand journal-brands:c0a2b42ad0f8102a7d435514573a4219
21 sg:indexingDatabase Scopus
22 Web of Science
23 sg:isOpenAccess true
24 sg:issnElectronic 1758-2946
25 sg:issue 1
26 sg:language English
27 sg:license http://scigraph.springernature.com/explorer/license/
28 sg:pageEnd 10
29 sg:pageStart 1
30 sg:publicationDate 2011-07-28
31 sg:publicationYear 2011
32 sg:publicationYearMonth 2011-07
33 sg:scigraphId e0274d5d846fafdab417ca260b49f810
34 sg:title AZOrange - High performance open source machine learning for QSAR modeling in a graphical programming environment
35 sg:volume 3
36 sg:webpage https://link.springer.com/10.1186/1758-2946-3-28
37 rdf:type sg:Article
38 rdfs:label Article: AZOrange - High performance open source machine learning for QSAR modeling in a graphical programming environment
HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular JSON format for linked data.

curl -H 'Accept: application/ld+json' 'http://scigraph.springernature.com/things/articles/e0274d5d846fafdab417ca260b49f810'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'http://scigraph.springernature.com/things/articles/e0274d5d846fafdab417ca260b49f810'
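
Because N-Triples puts one triple on each line, a downloaded record can be processed line by line. The following is a minimal sketch of that idea, not a full N-Triples parser: the regular expression only handles triples whose subject and predicate are IRIs in angle brackets (blank-node subjects are skipped), and the `http://example.org/` IRIs in the sample are placeholders chosen for illustration, not the real SciGraph vocabulary.

```python
import re
from collections import Counter

# Simplified pattern (assumption): subject and predicate are IRIs in angle
# brackets; the object is everything up to the terminating " ." of the line.
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) tuples, one per matching line."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith('#'):
            continue
        match = TRIPLE_RE.match(line)
        if match:
            triples.append(match.groups())
    return triples


# Hypothetical sample; the example.org IRIs stand in for the real predicates.
SAMPLE = '''
<http://example.org/article> <http://example.org/volume> "3" .
<http://example.org/article> <http://example.org/issue> "1" .
<http://example.org/article> <http://example.org/indexingDatabase> "Scopus" .
<http://example.org/article> <http://example.org/indexingDatabase> "Web of Science" .
'''

triples = parse_ntriples(SAMPLE)
# Count how many triples use each predicate.
predicate_counts = Counter(predicate for _, predicate, _ in triples)
```

For production use, a dedicated RDF library is likely a better choice than a hand-written pattern, since it would also handle escapes, blank nodes, and typed literals.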

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'http://scigraph.springernature.com/things/articles/e0274d5d846fafdab417ca260b49f810'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'http://scigraph.springernature.com/things/articles/e0274d5d846fafdab417ca260b49f810'
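
The four curl commands above differ only in the `Accept` header, so the same content negotiation can be wrapped in a small helper. This is a sketch using only the Python standard library; the function names `build_request` and `fetch` and the short format keys are illustrative choices, and it assumes the endpoint still responds as described on this page.

```python
import urllib.request

# Base URL taken from the curl examples above.
BASE_URL = "http://scigraph.springernature.com/things/articles/"

# Short format keys (illustrative names) mapped to the media types
# used in the Accept headers of the curl examples.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(article_id, fmt):
    """Build a GET request asking for the article in the given RDF format."""
    if fmt not in ACCEPT_TYPES:
        raise ValueError("unknown format: %r" % fmt)
    return urllib.request.Request(
        BASE_URL + article_id,
        headers={"Accept": ACCEPT_TYPES[fmt]},
    )


def fetch(article_id, fmt, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(article_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch("e0274d5d846fafdab417ca260b49f810", "turtle")` would then correspond to the Turtle curl command above, provided the service is reachable.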





