Commit 8b006e2961a82194f3373def2b4d23bed47a1520 - ctdconverter

Created scripts to house core functionality, refactored the Galaxy converter to use these new scripts, updated README files. Luis de la Garza 6 years ago

13 changed file(s) with 1688 addition(s) and 1658 deletion(s).

+145

-16

README.md

0	0	# CTDConverter
1
2	1	Given one or more CTD files, `CTDConverter` generates the needed wrappers to include them in workflow engines, such as Galaxy and CWL.
3	2
4	3	## Dependencies
	4	`CTDConverter` has the following python dependencies:
5	5
6		`CTDConverter` relies on [CTDopts]. The dependencies of each of the converters are as follows:
	6	- [CTDopts]
	7	- `lxml`
	8	- `pyyaml`
7	9
8		### Galaxy Converter
9
10		- Generation of Galaxy ToolConfig files relies on `lxml` to generate nice-looking XML files.
11
12		## Installing Dependencies
13		You can install the [CTDopts] and `lxml` modules via `conda`, like so:
	10	### Installing Dependencies
	11	The easiest way is to install [CTDopts] and all required dependencies via `conda`, like so:
14	12
15	13	```sh
16		$ conda install lxml
	14	$ conda install lxml pyyaml
17	15	$ conda install -c workflowconversion ctdopts
18	16	```
19	17
20		Note that the [CTDopts] module is available on the `workflowconversion` channel.
	18	Note that [CTDopts] is a python module available on the `workflowconversion` channel. Of course, you can just download [CTDopts] and make it available through your `PYTHONPATH` environment variable. To get more information about how to install python modules, visit: https://docs.python.org/2/install/.
21	19
22		Of course, you can just download [CTDopts] and make it available through your `PYTHONPATH` environment variable. To get more information about how to install python modules, visit: https://docs.python.org/2/install/.
	20	### Issues with `libxml2` and Schema Validation
	21	`lxml` depends on `libxml2`. When you install `lxml` you'll get the latest version of `libxml2` (2.9.4) by default. You would usually want the latest version, but this version of `libxml2` has a bug in validating XML files against a schema.
	22
	23	If you require validation of input CTDs against a schema (which we recommend), you will need to downgrade to the latest known version of `libxml2` that works, namely, 2.9.2. You can do it by executing the following command after you've installed all other dependencies:
	24
	25	```sh
	26	$ conda install -y libxml2=2.9.2
	27	```
	28
	29	You will be warned that this command will downgrade some packages; this is expected. The `-y` flag tells `conda` to perform the installation without confirmation.
	30
	31	## How to install `CTDConverter`
	32	`CTDConverter` is not a python module but a series of scripts, so installing it is as easy as downloading the source code from https://github.com/genericworkflownodes/CTDConverter.
	33
	34	## Usage
	35	The first thing that you need to tell `CTDConverter` is the output format of the converted wrappers. `CTDConverter` supports conversion of CTDs into Galaxy and CWL. Invoking it is as simple as follows:
	36
	37	$ python convert.py [FORMAT] [ADDITIONAL_PARAMETERS ...]
	38
	39	Here `[FORMAT]` can be any of the supported formats (i.e., `cwl`, `galaxy`). `CTDConverter` offers a series of format-specific scripts and we've designed these scripts to behave somewhat similarly. All converter scripts have the same core functionality, that is, read CTD files, parse them using [CTDopts], validate against a schema, etc. Of course, each converter script might add extra functionality that is not present in other engines, for instance, only the Galaxy converter script supports generation of a `tool_conf.xml` file.
	40
	41	The following sections in this file describe the parameters that all converter scripts share.
	42
	43	Please refer to the detailed documentation for each of the converters for more information:
	44
	45	- [Generation of Galaxy ToolConfig files](galaxy/README.md)
	46	- [Generation of CWL task files](cwl/README.md)
23	47
24	48
25		## How to install CTDConverter
	49	## Converting a single CTD
	50	In its simplest form, the converter takes an input CTD file and generates an output file. The following usage of `CTDConverter`:
26	51
27		1. Download the source code from https://github.com/genericworkflownodes/CTDConverter.
	52	$ python convert.py [FORMAT] -i /data/sample_input.ctd -o /data/sample_output.xml
28	53
29		## Usage
	54	will parse `/data/sample_input.ctd` and generate an appropriate converted file under `/data/sample_output.xml`. The generated file can be added to your workflow engine as usual.
30	55
31		Check the detailed documentation for each of the converters:
	56	## Converting several CTDs
	57	When converting several CTDs, the expected value for the `-o`/`--output-destination` parameter is a folder. For example:
32	58
33		- [Generation of Galaxy ToolConfig files](galaxy/README.md)
	59	$ python convert.py [FORMAT] -i /data/ctds/one.ctd /data/ctds/two.ctd -o /data/converted-files
	60
	61	will convert `/data/ctds/one.ctd` into `/data/converted-files/one.[EXT]` and `/data/ctds/two.ctd` into `/data/converted-files/two.[EXT]`. Each converter has a preferred extension, here shown as a variable (`[EXT]`). Galaxy prefers `xml`, while CWL prefers `cwl`.
	62
	63	You can use wildcard expansion, as supported by most shells:
	64
	65	$ python convert.py [FORMAT] -i /data/ctds/*.ctd -o /data/converted-files
	66
	67	## Common Parameters
	68	### Input File(s)
	69	* Purpose: Provide input CTD file(s) to convert.
	70	* Short/long version: `-i` / `--input`
	71	* Required: yes.
	72	* Taken values: a list of input CTD files.
34	73
	74	Example:
	75
	76	Any of the following invocations will convert `/data/input_one.ctd` and `/data/input_two.ctd`:
	77
	78	$ python convert.py [FORMAT] -i /data/input_one.ctd -i /data/input_two.ctd -o /data/generated
	79	$ python convert.py [FORMAT] -i /data/input_one.ctd /data/input_two.ctd -o /data/generated
	80	$ python convert.py [FORMAT] --input /data/input_one.ctd /data/input_two.ctd -o /data/generated
	81	$ python convert.py [FORMAT] --input /data/input_one.ctd --input /data/input_two.ctd -o /data/generated
	82
	83	The following invocation will convert `/data/input.ctd` into `/data/output.xml`:
	84
	85	$ python convert.py [FORMAT] -i /data/input.ctd -o /data/output.xml
	86
	87	Of course, you can also use wildcards, which most shells expand automatically. This is extremely useful if you want to convert several files at a time. Let's assume that the folder `/data/ctds` contains three files: `input_one.ctd`, `input_two.ctd` and `input_three.ctd`. The following two invocations will produce the same output in the `/data/wrappers` folder:
	88
	89	$ python convert.py [FORMAT] -i /data/ctds/input_one.ctd /data/ctds/input_two.ctd /data/ctds/input_three.ctd -o /data/wrappers
	90	$ python convert.py [FORMAT] -i /data/ctds/*.ctd -o /data/wrappers
	91
	92	### Output Destination
	93	* Purpose: Provide output destination for the converted wrapper files.
	94	* Short/long version: `-o` / `--output-destination`
	95	* Required: yes.
	96	* Taken values: if a single input file is given, then a single output file is expected. If multiple input files are given, then an existent folder, in which all converted CTDs will be written, is expected.
	97
	98	Examples:
	99
	100	A single input is given, and the output will be generated into `/data/output.xml`:
	101
	102	$ python convert.py [FORMAT] -i /data/input.ctd -o /data/output.xml
	103
	104	Several inputs are given. The output is the already existent folder, `/data/wrappers`, and at the end of the operation, the files `/data/wrappers/input_one.[EXT]` and `/data/wrappers/input_two.[EXT]` will be generated:
	105
	106	$ python convert.py [FORMAT] -i /data/ctds/input_one.ctd /data/ctds/input_two.ctd -o /data/wrappers
	107
	108	### Blacklisting Parameters
	109	* Purpose: Some parameters present in the CTD are not to be exposed in the output files. Think of parameters such as `--help` or `--debug`, which would not make much sense to expose to end users in a workflow management system.
	110	* Short/long version: `-b` / `--blacklist-parameters`
	111	* Required: no.
	112	* Taken values: A list of parameters to be blacklisted.
	113
	114	Example:
	115
	116	$ python convert.py [FORMAT] ... -b h help quiet
	117
	118	In this case, `CTDConverter` will not process any of the parameters named `h`, `help`, or `quiet`, that is, they will not appear in the generated output files.
	119
	120	### Schema Validation
	121	* Purpose: Provide validation of input CTDs against a schema file (i.e., an XSD file).
	122	* Short/long version: `-V` / `--validation-schema`
	123	* Required: no.
	124	* Taken values: location of the schema file (e.g., CTD.xsd).
	125
	126	CTDs can be validated against a schema. The master version of the schema can be found under [CTDSchema].
	127
	128	If a schema is provided, all input CTDs will be validated against it.
	129
	130	NOTE: Please make sure to read the [section on issues with schema validation](#issues-with-libxml2-and-schema-validation) if you require validation of CTDs against a schema.
	131
	132	### Hardcoding Parameters
	133	* Purpose: Fix the value of a parameter and hide it from the end user.
	134	* Short/long version: `-p` / `--hardcoded-parameters`
	135	* Required: no.
	136	* Taken values: The path of a file containing the mapping between parameter names and hardcoded values to use.
	137
	138	It is sometimes required that parameters are hidden from the end user in workflow systems and that they take a predetermined, fixed value. Allowing end users to control parameters similar to `--verbosity`, `--threads`, etc., might create more problems than it solves. For this purpose, the parameter `-p`/`--hardcoded-parameters` takes the path of a file that contains up to three columns separated by whitespace that map parameter names to the hardcoded value. The first column contains the name of the parameter and the second one the hardcoded value. The first two columns are mandatory.
	139
	140	If the parameter is to be hardcoded only for certain tools, a third column containing a comma separated list of tool names for which the hardcoding will apply can be added.
	141
	142	Lines starting with `#` will be ignored. The following is an example of a valid file:
	143
	144	# Parameter name # Value # Tool(s)
	145	threads 8
	146	mode quiet
	147	xtandem_executable xtandem XTandemAdapter
	148	verbosity high Foo, Bar
	149
	150	The parameters `threads` and `mode` will be set to `8` and `quiet`, respectively, for all parsed CTDs. However, the `xtandem_executable` parameter will be set to `xtandem` only for the `XTandemAdapter` tool. Similarly, the parameter `verbosity` will be set to `high` for the `Foo` and `Bar` tools only.
	151
	152	### Providing a Default Executable Path
	153	* Purpose: Help workflow engines locate tools by providing a path.
	154	* Short/long version: `-x` / `--default-executable-path`
	155	* Required: no.
	156	* Taken values: The default executable path of the tools in the target workflow engine.
	157
	158	CTDs can contain an `<executablePath>` element that will be used when executing the tool binary. If this element is missing, the value provided by this parameter will be used as a prefix when building the appropriate sections in the output files.
	159
	160	The following invocation of the converter will use `/opt/suite/bin` as a prefix when providing the executable path in the output files for any input CTD that lacks the `<executablePath>` section:
	161
	162	$ python convert.py [FORMAT] -x /opt/suite/bin ...
	163
35	164
36	165	[CTDopts]: https://github.com/genericworkflownodes/CTDopts
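
Reviewer note: the hardcoded-parameters file format described in the README above can be illustrated with a minimal, self-contained sketch. It is written for Python 3 and is not the code from this commit; the function names `parse_hardcoded_parameters` and `lookup` and the tuple-keyed dictionary are assumptions made for illustration only.

```python
import io


def parse_hardcoded_parameters(lines):
    """Map (parameter, tool) -> value; tool is None when the value applies to all tools."""
    hardcoded = {}
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are ignored
        # split into at most three columns so that "Foo, Bar" stays in one piece
        columns = stripped.split(None, 2)
        if len(columns) < 2:
            continue  # the first two columns are mandatory
        name, value = columns[0], columns[1]
        if len(columns) == 3:
            for tool in columns[2].split(","):
                hardcoded[(name, tool.strip())] = value
        else:
            hardcoded[(name, None)] = value
    return hardcoded


def lookup(hardcoded, name, tool):
    """Prefer a tool-specific value and fall back to the generic one."""
    return hardcoded.get((name, tool), hardcoded.get((name, None)))


EXAMPLE = """\
# Parameter name # Value # Tool(s)
threads 8
mode quiet
xtandem_executable xtandem XTandemAdapter
verbosity high Foo, Bar
"""

table = parse_hardcoded_parameters(io.StringIO(EXAMPLE))
print(lookup(table, "threads", "AnyTool"))    # generic value
print(lookup(table, "verbosity", "Bar"))      # tool-specific value
print(lookup(table, "verbosity", "AnyTool"))  # not hardcoded for this tool
```

Splitting with a maximum of two splits keeps the comma-separated tool list intact even when it contains spaces, which is why `Foo, Bar` in the example file is read as two tool names.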

-0

common/__init__.py

(New empty file)

+45

-0

common/exceptions.py

	0	#!/usr/bin/env python
	1	# encoding: utf-8
	2
	3	"""
	4	@author: delagarza
	5	"""
	6
	7	from CTDopts.CTDopts import ModelError
	8
	9
	10	class CLIError(Exception):
	11	    # Generic exception to raise and log different fatal errors.
	12	    def __init__(self, msg):
	13	        super(CLIError, self).__init__(msg)
	14	        self.msg = "E: %s" % msg
	15
	16	    def __str__(self):
	17	        return self.msg
	18
	19	    def __unicode__(self):
	20	        return self.msg
	21
	22
	23	class InvalidModelException(ModelError):
	24	    def __init__(self, message):
	25	        super(InvalidModelException, self).__init__()
	26	        self.message = message
	27
	28	    def __str__(self):
	29	        return self.message
	30
	31	    def __repr__(self):
	32	        return self.message
	33
	34
	35	class ApplicationException(Exception):
	36	    def __init__(self, msg):
	37	        super(ApplicationException, self).__init__(msg)
	38	        self.msg = msg
	39
	40	    def __str__(self):
	41	        return self.msg
	42
	43	    def __unicode__(self):
	44	        return self.msg⏎
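
Reviewer note: a minimal sketch of how an exception class of this shape can initialise its base class so that the message also ends up in `e.args`. It uses Python 3 syntax; under Python 2, which this commit targets, the equivalent call is `super(CLIError, self).__init__(msg)`. The example message is made up for illustration.

```python
class CLIError(Exception):
    """Generic exception for fatal command-line errors."""

    def __init__(self, msg):
        # initialise the base Exception so that e.args carries the raw message
        super().__init__(msg)
        self.msg = "E: %s" % msg

    def __str__(self):
        return self.msg


try:
    raise CLIError("missing input file")
except CLIError as e:
    print(e)       # formatted through __str__
    print(e.args)  # raw message as stored by Exception
```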

+23

-0

common/logger.py

	0	#!/usr/bin/env python
	1	# encoding: utf-8
	2	import sys
	3
	4	MESSAGE_INDENTATION_INCREMENT = 2
	5
	6
	7	def _get_indented_text(text, indentation_level):
	8	    return ("%(indentation)s%(text)s" %
	9	            {"indentation": " " * (MESSAGE_INDENTATION_INCREMENT * indentation_level),
	10	             "text": text})
	11
	12
	13	def warning(warning_text, indentation_level=0):
	14	    sys.stdout.write(_get_indented_text("WARNING: %s\n" % warning_text, indentation_level))
	15
	16
	17	def error(error_text, indentation_level=0):
	18	    sys.stderr.write(_get_indented_text("ERROR: %s\n" % error_text, indentation_level))
	19
	20
	21	def info(info_text, indentation_level=0):
	22	    sys.stdout.write(_get_indented_text("INFO: %s\n" % info_text, indentation_level))
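
Reviewer note: the indentation scheme used by these logging helpers boils down to prefixing each message with two spaces per level. The sketch below restates that idea in a standalone form; the helper name `indent` and the sample messages are assumptions for illustration.

```python
import sys

MESSAGE_INDENTATION_INCREMENT = 2


def indent(text, indentation_level=0):
    # two spaces per indentation level, then the message itself
    return " " * (MESSAGE_INDENTATION_INCREMENT * indentation_level) + text


sys.stdout.write(indent("INFO: Loading validation schema\n"))
sys.stdout.write(indent("ERROR: Invalid CTD file\n", 1))
```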

+194

-0

common/utils.py

	0	#!/usr/bin/env python
	1	# encoding: utf-8
	2	import ntpath
	3	import os
	4
	5	from lxml import etree
	6	from string import strip
	7	from logger import info, error, warning
	8
	9	from common.exceptions import ApplicationException
	10	from CTDopts.CTDopts import CTDModel
	11
	12
	13	MESSAGE_INDENTATION_INCREMENT = 2
	14
	15
	16	# simple struct-class containing a tuple with input/output location and the in-memory CTDModel
	17	class ParsedCTD:
	18	    def __init__(self, ctd_model=None, input_file=None, suggested_output_file=None):
	19	        self.ctd_model = ctd_model
	20	        self.input_file = input_file
	21	        self.suggested_output_file = suggested_output_file
	22
	23
	24	class ParameterHardcoder:
	25	    def __init__(self):
	26	        # map whose keys are the composite names of parameters and tools in the following pattern:
	27	        # [ParameterName][separator][ToolName] -> HardcodedValue
	28	        # if the parameter applies to all tools, then the following pattern is used:
	29	        # [ParameterName] -> HardcodedValue
	30
	31	        # examples (assuming separator is '!'):
	32	        # threads -> 24
	33	        # adapter!XtandemAdapter -> xtandem.exe
	34	        # adapter -> adapter.exe
	35	        self.separator = "!"
	36	        self.parameter_map = {}
	37
	38	    # the most specific value will be returned in case of overlap
	39	    def get_hardcoded_value(self, parameter_name, tool_name):
	40	        # look for the value that would apply for all tools
	41	        generic_value = self.parameter_map.get(parameter_name, None)
	42	        specific_value = self.parameter_map.get(self.build_key(parameter_name, tool_name), None)
	43	        if specific_value is not None:
	44	            return specific_value
	45
	46	        return generic_value
	47
	48	    def register_parameter(self, parameter_name, parameter_value, tool_name=None):
	49	        self.parameter_map[self.build_key(parameter_name, tool_name)] = parameter_value
	50
	51	    def build_key(self, parameter_name, tool_name):
	52	        if tool_name is None:
	53	            return parameter_name
	54	        return "%s%s%s" % (parameter_name, self.separator, tool_name)
	55
	56
	57	def validate_path_exists(path):
	58	    if not os.path.isfile(path) or not os.path.exists(path):
	59	        raise ApplicationException("The provided path (%s) does not exist or is not a valid file path." % path)
	60
	61
	62	def validate_argument_is_directory(args, argument_name):
	63	    file_name = getattr(args, argument_name)
	64	    if file_name is not None and os.path.isdir(file_name):
	65	        raise ApplicationException("The provided output file name (%s) points to a directory." % file_name)
	66
	67
	68	def validate_argument_is_valid_path(args, argument_name):
	69	    paths_to_check = []
	70	    # check if we are handling a single file or a list of files
	71	    member_value = getattr(args, argument_name)
	72	    if member_value is not None:
	73	        if isinstance(member_value, list):
	74	            for file_name in member_value:
	75	                paths_to_check.append(strip(str(file_name)))
	76	        else:
	77	            paths_to_check.append(strip(str(member_value)))
	78
	79	        for path_to_check in paths_to_check:
	80	            validate_path_exists(path_to_check)
	81
	82
	83	# taken from
	84	# http://stackoverflow.com/questions/8384737/python-extract-file-name-from-path-no-matter-what-the-os-path-format
	85	def get_filename(path):
	86	    head, tail = ntpath.split(path)
	87	    return tail or ntpath.basename(head)
	88
	89
	90	def get_filename_without_suffix(path):
	91	    root, ext = os.path.splitext(os.path.basename(path))
	92	    return root
	93
	94
	95	def parse_input_ctds(xsd_location, input_ctds, output_destination, output_file_extension):
	96	    is_converting_multiple_ctds = len(input_ctds) > 1
	97	    parsed_ctds = []
	98	    schema = None
	99	    if xsd_location is not None:
	100	        try:
	101	            info("Loading validation schema from %s" % xsd_location, 0)
	102	            schema = etree.XMLSchema(etree.parse(xsd_location))
	103	        except Exception, e:
	104	            error("Could not load validation schema %s. Reason: %s" % (xsd_location, str(e)), 0)
	105	    else:
	106	        info("Validation against a schema has not been enabled.", 0)
	107	    for input_ctd in input_ctds:
	108	        try:
	109	            if schema is not None:
	110	                validate_against_schema(input_ctd, schema)
	111	            output_file = output_destination
	112	            # if multiple inputs are being converted, we need to generate a different output_file for each input
	113	            if is_converting_multiple_ctds:
	114	                output_file = os.path.join(output_file,
	115	                                           get_filename_without_suffix(input_ctd) + '.' + output_file_extension)
	116	            parsed_ctds.append(ParsedCTD(CTDModel(from_file=input_ctd), input_ctd, output_file))
	117	        except Exception, e:
	118	            error(str(e), 1)
	119	            continue
	120	    return parsed_ctds
	121
	122
	123	def flatten_list_of_lists(args, list_name):
	124	    setattr(args, list_name, [item for sub_list in getattr(args, list_name) for item in sub_list])
	125
	126
	127	def validate_against_schema(ctd_file, schema):
	128	    try:
	129	        parser = etree.XMLParser(schema=schema)
	130	        etree.parse(ctd_file, parser=parser)
	131	    except etree.XMLSyntaxError, e:
	132	        raise ApplicationException("Invalid CTD file %s. Reason: %s" % (ctd_file, str(e)))
	133
	134
	135	def add_common_parameters(parser, version, last_updated):
	136	    parser.add_argument("FORMAT", default=None, help="Output format (mandatory). Can be one of: cwl, galaxy.")
	137	    parser.add_argument("-i", "--input", dest="input_files", default=[], required=True, nargs="+", action="append",
	138	                        help="List of CTD files to convert.")
	139	    parser.add_argument("-o", "--output-destination", dest="output_destination", required=True,
	140	                        help="If multiple input files are given, then a folder in which all converted "
	141	                             "files will be generated is expected; "
	142	                             "if a single input file is given, then a destination file is expected.")
	143	    parser.add_argument("-x", "--default-executable-path", dest="default_executable_path",
	144	                        help="Use this executable path when <executablePath> is not present in the CTD",
	145	                        default=None, required=False)
	146	    parser.add_argument("-b", "--blacklist-parameters", dest="blacklisted_parameters", default=[], nargs="+",
	147	                        action="append",
	148	                        help="List of parameters that will be ignored and won't appear on the galaxy stub",
	149	                        required=False)
	150	    parser.add_argument("-p", "--hardcoded-parameters", dest="hardcoded_parameters", default=None, required=False,
	151	                        help="File containing hardcoded values for the given parameters. Run with '-h' or '--help' "
	152	                             "to see a brief example on the format of this file.")
	153	    parser.add_argument("-V", "--validation-schema", dest="xsd_location", default=None, required=False,
	154	                        help="Location of the schema to use to validate CTDs. If not provided, no schema validation "
	155	                             "will take place.")
	156
	157	    # TODO: add verbosity, maybe?
	158	    program_version = "v%s" % version
	159	    program_build_date = str(last_updated)
	160	    program_version_message = '%%(prog)s %s (%s)' % (program_version, program_build_date)
	161	    parser.add_argument("-v", "--version", action='version', version=program_version_message)
	162
	163
	164	def parse_hardcoded_parameters(hardcoded_parameters_file):
	165	    parameter_hardcoder = ParameterHardcoder()
	166	    if hardcoded_parameters_file is not None:
	167	        line_number = 0
	168	        with open(hardcoded_parameters_file) as f:
	169	            for line in f:
	170	                line_number += 1
	171	                if line is None or not line.strip() or line.strip().startswith("#"):
	172	                    pass
	173	                else:
	174	                    # the third column must be obtained as a whole, and not split
	175	                    parsed_hardcoded_parameter = line.strip().split(None, 2)
	176	                    # valid lines contain two or three columns
	177	                    if len(parsed_hardcoded_parameter) != 2 and len(parsed_hardcoded_parameter) != 3:
	178	                        warning("Invalid line at line number %d of the given hardcoded parameters file. Line will be "
	179	                                "ignored:\n%s" % (line_number, line), 0)
	180	                        continue
	181
	182	                    parameter_name = parsed_hardcoded_parameter[0]
	183	                    hardcoded_value = parsed_hardcoded_parameter[1]
	184	                    tool_names = None
	185	                    if len(parsed_hardcoded_parameter) == 3:
	186	                        tool_names = parsed_hardcoded_parameter[2].split(',')
	187	                    if tool_names:
	188	                        for tool_name in tool_names:
	189	                            parameter_hardcoder.register_parameter(parameter_name, hardcoded_value, tool_name.strip())
	190	                    else:
	191	                        parameter_hardcoder.register_parameter(parameter_name, hardcoded_value)
	192
	193	    return parameter_hardcoder
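
Reviewer note: `add_common_parameters` registers `-i`/`--input` with both `nargs="+"` and `action="append"`, so argparse collects one list per occurrence of the flag, and `flatten_list_of_lists` is what turns that into a plain list. The standalone sketch below (Python 3, with a throwaway parser and a hypothetical `parse_inputs` helper) shows why repeating `-i` and passing several files to a single `-i` end up equivalent.

```python
import argparse
from itertools import chain

parser = argparse.ArgumentParser(prog="sketch")
parser.add_argument("-i", "--input", dest="input_files", required=True,
                    nargs="+", action="append")


def parse_inputs(argv):
    args = parser.parse_args(argv)
    # args.input_files is a list of lists: one inner list per occurrence of -i
    return list(chain.from_iterable(args.input_files))


repeated_flag = parse_inputs(["-i", "one.ctd", "-i", "two.ctd"])
single_flag = parse_inputs(["-i", "one.ctd", "two.ctd"])
print(repeated_flag == single_flag)
```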

+265

-0

convert.py

	0	import os
	1	import sys
	2	import traceback
	3	import common.utils as utils
	4
	5	from argparse import ArgumentParser
	6	from argparse import RawDescriptionHelpFormatter
	7	from common.exceptions import ApplicationException, ModelError
	8
	9
	10	__all__ = []
	11	__version__ = 2.0
	12	__date__ = '2014-09-17'
	13	__updated__ = '2017-08-09'
	14
	15	program_version = "v%s" % __version__
	16	program_build_date = str(__updated__)
	17	program_version_message = '%%(prog)s %s (%s)' % (program_version, program_build_date)
	18	program_short_description = "CTDConverter - A project from the WorkflowConversion family " \
	19	"(https://github.com/WorkflowConversion/CTDConverter)"
	20	program_usage = '''
	21	USAGE:
	22
	23	$ python convert.py [FORMAT] [ARGUMENTS ...]
	24
	25	FORMAT can be either one of the supported output formats: cwl, galaxy.
	26
	27	There is one converter for each supported FORMAT, each taking a different set of arguments. Please consult the detailed
	28	documentation for each of the converters. Nevertheless, all converters have the following common parameters/options:
	29
	30
	31	I - Parsing a single CTD file and convert it:
	32
	33	$ python convert.py [FORMAT] -i [INPUT_FILE] -o [OUTPUT_FILE]
	34
	35
	36	II - Parsing several CTD files, output converted wrappers in a given folder:
	37
	38	$ python convert.py [FORMAT] -i [INPUT_FILES] -o [OUTPUT_DIRECTORY]
	39
	40
	41	III - Hardcoding parameters
	42
	43	It is possible to hardcode parameters. This makes sense if you want to set a tool in 'quiet' mode or if your tools
	44	support multi-threading and accept the number of threads via a parameter, without giving end users the chance to
	45	change the values for these parameters.
	46
	47	In order to generate hardcoded parameters, you need to provide a simple file. Each line of this file contains
	48	two or three columns separated by whitespace. Any line starting with a '#' will be ignored. The first column contains
	49	the name of the parameter, the second column contains the value that will always be set for this parameter. Only the
	50	first two columns are mandatory.
	51
	52	If the parameter is to be hardcoded only for a set of tools, then a third column can be added. This column contains
	53	a comma-separated list of tool names for which the parameter will be hardcoded. If a third column is not present,
	54	then all processed tools containing the given parameter will get a hardcoded value for it.
	55
	56	The following is an example of a valid file:
	57
	58	##################################### HARDCODED PARAMETERS example #####################################
	59	# Every line starting with a # will be handled as a comment and will not be parsed.
	60	# The first column is the name of the parameter and the second column is the value that will be used.
	61
	62	# Parameter name # Value # Tool(s)
	63	threads 8
	64	mode quiet
	65	xtandem_executable xtandem XTandemAdapter
	66	verbosity high Foo, Bar
	67
	68	#########################################################################################################
	69
	70	Using the above file will produce a command-line similar to:
	71
	72	[TOOL] ... -threads 8 -mode quiet ...
	73
	74	for all tools. For XTandemAdapter, however, the command-line will look like:
	75
	76	XtandemAdapter ... -threads 8 -mode quiet -xtandem_executable xtandem ...
	77
	78	And for tools Foo and Bar, the command-line will be similar to:
	79
	80	Foo -threads 8 -mode quiet -verbosity high ...
	81
	82
	83	IV - Engine-specific parameters
	84
	85	i - Galaxy
	86
	87	a. Providing file formats, mimetypes
	88
	89	Galaxy supports the concept of file format in order to connect compatible ports, that is, input ports of a
	90	certain data format will be able to receive data from a port from the same format. This converter allows you
	91	to provide a personalized file in which you can relate the CTD data formats with supported Galaxy data formats.
	92	The layout of this file consists of lines, each of either one, three or four columns separated by any amount of
	93	whitespace. The content of each column is as follows:
	94
	95	* 1st column: file extension
	96	* 2nd column: data type, as listed in Galaxy
	97	* 3rd column: full-named Galaxy data type, as it will appear on datatypes_conf.xml
	98	* 4th column: mimetype (optional)
	99
	100	The following is an example of a valid "file formats" file:
	101
	102	########################################## FILE FORMATS example ##########################################
	103	# Every line starting with a # will be handled as a comment and will not be parsed.
	104	# The first column is the file format as given in the CTD and second column is the Galaxy data format. The
	105	# second, third and fourth columns can be left empty if the data type has already been registered
	106	# in Galaxy, otherwise, all but the mimetype must be provided.
	107
	108	# CTD type # Galaxy type # Long Galaxy data type # Mimetype
	109	csv tabular galaxy.datatypes.data:Text
	110	fasta
	111	ini txt galaxy.datatypes.data:Text
	112	txt
	113	idxml txt galaxy.datatypes.xml:GenericXml application/xml
	114	options txt galaxy.datatypes.data:Text
	115	grid grid galaxy.datatypes.data:Grid
	116	##########################################################################################################
	117
	118	Note that each line consists precisely of either one, three or four columns. In the case of data types already
	119	registered in Galaxy (such as fasta and txt in the above example), only the first column is needed. In the
	120	case of data types that haven't been yet registered in Galaxy, the first three columns are needed
	121	(mimetype is optional).
	122
	123	For information about Galaxy data types and subclasses, see the following page:
	124	https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes
	125
	126
	127	b. Finer control over which tools will be converted
	128
	129	Sometimes only a subset of CTDs needs to be converted. It is possible to either explicitly specify which tools
	130	will be converted or which tools will not be converted.
	131
	132	The value of the -s/--skip-tools parameter is a file in which each line will be interpreted as the name of a
	133	tool that will not be converted. Conversely, the value of the -r/--required-tools is a file in which each line
	134	will be interpreted as a tool that is required. Only one of these parameters can be specified at a given time.
	135
	136	The format of both files is exactly the same. As stated before, each line will be interpreted as the name of a
	137	tool. Any line starting with a '#' will be ignored.
	138
	139
	140	ii - CWL
	141
	142	There are, for now, no CWL-specific parameters or options.
	143
	144	'''
	145
	146	program_license = '''%(short_description)s
	147
	148	Copyright 2017, WorkflowConversion
	149
	150	Licensed under the Apache License, Version 2.0 (the "License");
	151	you may not use this file except in compliance with the License.
	152	You may obtain a copy of the License at
	153
	154	http://www.apache.org/licenses/LICENSE-2.0
	155
	156	Unless required by applicable law or agreed to in writing, software
	157	distributed under the License is distributed on an "AS IS" BASIS,
	158	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	159	See the License for the specific language governing permissions and
	160	limitations under the License.
	161
	162	%(usage)s
	163	''' % {'short_description': program_short_description, 'usage': program_usage}
	164
	165
	166	def main(argv=None):
	167	    if argv is None:
	168	        argv = sys.argv
	169	    else:
	170	        sys.argv.extend(argv)
	171
	172	    # check that we have, at least, one argument provided
	173	    # at this point we cannot parse the arguments, because each converter takes different arguments, meaning each
	174	    # converter will register its own parameters after we've registered the basic ones... we have to do it old school
	175	    if len(argv) < 2:
	176	        utils.error('Not enough arguments provided')
	177	        print('\nUsage: $ python convert.py [TARGET] [ARGUMENTS]\n\n' +
	178	              'Where:\n' +
	179	              ' target: one of \'cwl\' or \'galaxy\'\n\n' +
	180	              'Run again using the -h/--help option to print more detailed help.\n')
	181	        return 1
	182
	183	    # TODO: at some point this should look like real software engineering and use a map containing converter instances
	184	    # whose keys would be the name of the converter (e.g., cwl, galaxy), but for the time being, only two formats
	185	    # are supported
	186	    target = str.lower(argv[1])
	187	    if target == 'cwl':
	188	        from cwl import converter
	189	    elif target == 'galaxy':
	190	        from galaxy import converter
	191	    elif target == '-h' or target == '--help' or target == '--h' or target == 'help':
	192	        print(program_license)
	193	        return 0
	194	    else:
	195	        utils.error('Unrecognized target engine. Supported targets are \'cwl\' and \'galaxy\'.')
	196	        return 1
	197
	198	    try:
	199	        # Setup argument parser
	200	        parser = ArgumentParser(prog="CTDConverter", description=program_license,
	201	                                formatter_class=RawDescriptionHelpFormatter, add_help=True)
	202	        utils.add_common_parameters(parser, program_version_message, program_build_date)
	203
	204	        # add tool-specific arguments
	205	        converter.add_specific_args(parser)
	206
	207	        # parse arguments and perform some basic, common validation
	208	        args = parser.parse_args()
	209	        validate_and_prepare_common_arguments(args)
	210
	211	        # parse the input CTD files into CTDModels
	212	        parsed_ctds = utils.parse_input_ctds(args.xsd_location, args.input_files, args.output_destination,
	213	                                             converter.get_preferred_file_extension())
	214
	215	        # let the converter do its own thing
	216	        return converter.convert_models(args, parsed_ctds)
	217
	218	    except KeyboardInterrupt:
	219	        # handle keyboard interrupt
	220	        return 0
	221
	222	    except ApplicationException, e:
	223	        utils.error("CTDConverter could not complete the requested operation.", 0)
	224	        utils.error("Reason: " + e.msg, 0)
	225	        return 1
	226
	227	    except ModelError, e:
	228	        utils.error("There seems to be a problem with one of your input CTDs.", 0)
	229	        utils.error("Reason: " + e.msg, 0)
	230	        return 1
	231
	232	    except Exception, e:
	233	        traceback.print_exc()
	234	        return 2
	235
	236	    return 0
	237
	238
	239	def validate_and_prepare_common_arguments(args):
	240	# flatten lists of lists to a list containing elements
	241	lists_to_flatten = ["input_files", "blacklisted_parameters"]
	242	for list_to_flatten in lists_to_flatten:
	243	utils.flatten_list_of_lists(args, list_to_flatten)
	244
	245	# if input is a single file, we expect output to be a file (and not a dir that already exists)
	246	if len(args.input_files) == 1:
	247	if os.path.isdir(args.output_destination):
	248	raise ApplicationException("If a single input file is provided, output (%s) is expected to be a file "
	249	"and not a folder.\n" % args.output_destination)
	250
	251	# if input is a list of files, we expect output to be a folder
	252	if len(args.input_files) > 1:
	253	if not os.path.isdir(args.output_destination):
	254	raise ApplicationException("If several input files are provided, output (%s) is expected to be an "
	255	"existing directory.\n" % args.output_destination)
	256
	257	# check that the provided input files, if provided, contain a valid file path
	258	input_arguments_to_check = ["xsd_location", "input_files", "hardcoded_parameters"]
	259	for argument_name in input_arguments_to_check:
	260	utils.validate_argument_is_valid_path(args, argument_name)
	261
	262
	263	if __name__ == "__main__":
	264	sys.exit(main())
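The TODO in `main` mentions replacing the `if`/`elif` chain with a map keyed by converter name. A minimal sketch of that idea follows. It maps names to module paths rather than to converter instances, and `CONVERTER_MODULES`, `resolve_converter_module` and `load_converter` are hypothetical names that do not appear in this commit.

```python
import importlib

# Hypothetical registry: keys are the target names accepted on the command
# line, values are the modules imported by the if/elif chain in main().
CONVERTER_MODULES = {
    "cwl": "cwl.converter",
    "galaxy": "galaxy.converter",
}


def resolve_converter_module(target, registry=CONVERTER_MODULES):
    """Return the module path registered for target, or None if unknown."""
    return registry.get(target.lower())


def load_converter(target, registry=CONVERTER_MODULES):
    """Import and return the converter module registered for target.

    Raises ValueError for unsupported targets so the caller can report it.
    """
    module_path = resolve_converter_module(target, registry)
    if module_path is None:
        raise ValueError("Unrecognized target engine '%s'. Supported targets are: %s."
                         % (target, ", ".join(sorted(registry))))
    return importlib.import_module(module_path)
```

With such a registry, supporting a new target would only require a new dictionary entry, provided the module offers the same functions that `main` calls (`add_specific_args`, `get_preferred_file_extension`, `convert_models`).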

+24

-222

galaxy/README.md

0	0	# Conversion of CTD Files to Galaxy ToolConfigs
	1	## Generating a `tool_conf.xml` File
	2	* Purpose: Galaxy uses a file `tool_conf.xml` in which other tools can be included. `CTDConverter` can also generate this file. Categories will be extracted from the provided input CTDs and for each category, a different `<section>` will be generated. Any input CTD lacking a category will be sorted under the provided default category.
	3	* Short/long version: `-t` / `--tool-conf-destination`
	4	* Required: no.
	5	* Taken values: The destination of the file.
1	6
2		## How to use: most common Tasks
	7	$ python convert.py galaxy -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -t /data/generated-galaxy-stubs/tool_conf.xml
3	8
4		The Galaxy ToolConfig generator takes several parameters and a varying number of inputs and outputs. The following sub-sections show how to perform the most common operations.
5
6		Running the generator with the `-h/--help` parameter will print extended information about each of the parameters.
7
8		### Macros
9
10		Galaxy supports the use of macros via a `macros.xml` file (we provide a sample macros file in [macros.xml]). Instead of repeating sections, macros can be used and expanded. If you want fine control over the macros, you can use the `-m` / `--macros` parameter to provide your own macros file.
11
12		Please note that the used macros file must be copied to your Galaxy installation on the same location in which you place the generated ToolConfig files, otherwise Galaxy will not be able to parse the generated ToolConfig files!
13
14		### One input, one Output
15
16		In its simplest form, the converter takes an input CTD file and generates an output Galaxy ToolConfig file. The following usage of `generator.py`:
17
18		$ python generator.py -i /data/sample_input.ctd -o /data/sample_output.xml
19
20		will parse `/data/sample_input.ctd` and generate a Galaxy tool wrapper under `/data/sample_output.xml`. The generated file can be added to your Galaxy instance like any other tool.
21
22		### Converting several CTDs at once
23
24		When converting several CTDs, the expected value for the `-o`/`--output` parameter is a folder. For example:
25
26		$ python generator.py -i /data/ctds/one.ctd /data/ctds/two.ctd -o /data/generated-galaxy-stubs
27
28		Will convert `/data/ctds/one.ctd` into `/data/generated-galaxy-stubs/one.xml` and `/data/ctds/two.ctd` into `/data/generated-galaxy-stubs/two.xml`.
29
30		You can use wildcard expansion, as supported by most modern operating systems:
31
32		$ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs
33
34		### Generating a tool_conf.xml File
35
36		The generator supports generation of a `tool_conf.xml` file which you can later use in your local Galaxy installation. The parameter `-t`/`--tool-conf-destination` contains the path of a file in which a `tool_conf.xml` file will be generated.
37
38		$ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -t /data/generated-galaxy-stubs/tool_conf.xml
39
40
41		## How to use: Parameters in Detail
42
43		### A Word about Parameters taking Lists of Values
44
45		All parameters have a short and a long option and some parameters take list of values. Using either the long or the short option of the parameter will produce the same output. The following examples show how to pass values using the `-f` / `--foo` parameter:
46
47		The following uses of the parameter will pass the list of values containing `bar`, `blah` and `blu`:
48
49		-f bar blah blu
50		--foo bar blah blu
51		-f bar -f blah -f blu
52		--foo bar --foo blah --foo blu
53		-f bar --foo blah blu
54
55		The following uses of the parameter will pass a single value `bar`:
56
57		-f bar
58		--foo bar
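
One way to get this behavior with `argparse` is to combine `nargs` with `action="append"` and then flatten the result, similar to what the converter does for `-m`/`--macros`. The following is only a sketch: `-f`/`--foo` is the placeholder parameter from the examples above, and `flatten` is a hypothetical helper, not the converter's own `utils.flatten_list_of_lists`.

```python
from argparse import ArgumentParser


def flatten(list_of_lists):
    """Flatten one level of nesting: [['a', 'b'], ['c']] -> ['a', 'b', 'c']."""
    return [item for sublist in list_of_lists for item in sublist]


parser = ArgumentParser(prog="example")
# nargs="+" collects the values that follow one occurrence of the option;
# action="append" stores one such list per occurrence of the option.
parser.add_argument("-f", "--foo", dest="foo", nargs="+", action="append", default=[])

# Equivalent to the command line: -f bar --foo blah blu
args = parser.parse_args(["-f", "bar", "--foo", "blah", "blu"])
values = flatten(args.foo)
```

Here `args.foo` is a list of lists (one inner list per occurrence of the option), which is why a flattening step is needed before the values can be used as a plain list.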
59
60		### Schema Validation
61
62		* Purpose: Provide validation of input CTDs against a schema file (i.e, a XSD file).
63		* Short/long version: `v` / `--validation-schema`
64		* Required: no.
65		* Taken values: location of the schema file (e.g., CTD.xsd).
66
67		CTDs can be validated against a schema. The master version of the schema can be found under [CTDSchema].
68
69		If a schema is provided, all input CTDs will be validated against it.
70
71		### Input File(s)
72
73		* Purpose: Provide input CTD file(s) to convert.
74		* Short/long version: `-i` / `--input`
75		* Required: yes.
76		* Taken values: a list of input CTD files.
77
78		Example:
79
80		Any of the following invocations will convert `/data/input_one.ctd` and `/data/input_two.ctd`:
81
82		$ python generator.py -i /data/input_one.ctd -i /data/input_two.ctd -o /data/generated
83		$ python generator.py -i /data/input_one.ctd /data/input_two.ctd -o /data/generated
84		$ python generator.py --input /data/input_one.ctd /data/input_two.ctd -o /data/generated
85		$ python generator.py --input /data/input_one.ctd --input /data/input_two.ctd -o /data/generated
86
87		The following invocation will convert `/data/input.ctd` into `/data/output.xml`:
88
89		$ python generator.py -i /data/input.ctd -o /data/output.xml -m sample_files/macros.xml
90
91		Of course, you can also use wildcards, which will be automatically expanded by any modern operating system. This is extremely useful if you want to convert several files at a time. Imagine that the folder `/data/ctds` contains three files, `input_one.ctd`, `input_two.ctd` and `input_three.ctd`. The following two invocations will produce the same output in the `/data/galaxy`:
92
93		$ python generator.py -i /data/input_one.ctd /data/input_two.ctd /data/input_three.ctd -o /data/galaxy
94		$ python generator.py -i /data/*.ctd -o /data/galaxy
95
96		### Finer Control over the Tools to be converted
97
98		Sometimes only a set of CTDs in a folder need to be converted. The parameter `-r`/`--required-tools` takes the path a file containing the names of tools that will be converted.
99
100		$ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -r required_tools.txt
101
102		On the other hand, if you want the generator to skip conversion of some CTDs, the parameter `-s`/`--skip-tools` will take the path of a file containing the names of tools that will not be converted.
103
104		$ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -s skipped_tools.txt
105
106		The format of these files (`required_tools.txt`, `skipped_tools.txt` in the examples above) is straightforward. Each line contains the name of a tool and any line starting with `#` will be ignored.
107
108		### Output Destination
109
110		* Purpose: Provide output destination for the generated Galaxy ToolConfig files.
111		* Short/long version: `-o` / `--output-destination`
112		* Required: yes.
113		* Taken values: if a single input file is given, then a single output file is expected. If multiple input files are given, then an existent folder, in which all generated Galaxy ToolConfig will be written, is expected.
114
115		Example:
116
117		A single input is given, and the output will be generated into `/data/output.xml`:
118
119		$ python generator.py -i /data/input.ctd -o /data/output.xml
120
121		Several inputs are given. The output is the already existent folder, `/data/stubs`, and at the end of the operation, the files `/data/stubs/input_one.ctd.xml` and `/data/stubs/input_two.ctd.xml` will be generated:
122
123		$ python generator.py -i /data/ctds/input_one.ctd /data/ctds/input_two.ctd -o /data/stubs
124
125
126		### Adding Parameters to the Command-line
127
	9	## Adding Parameters to the Command-line
128	10	* Purpose: Galaxy ToolConfig files include a `<command>` element in which the command line to invoke the tool can be given. Sometimes tools need to be invoked in a certain way (e.g., with certain parameters). For instance, some tools can be invoked in a verbose or quiet mode, or in a headless mode (i.e., without a GUI).
129	11	* Short/long version: `-a` / `--add-to-command-line`
130	12	* Required: no.

132	14
133	15	Example:
134	16
135		$ python generator.py ... -a "--quiet --no-gui"
	17	$ python convert.py galaxy ... -a "--quiet --no-gui"
136	18
137	19	Will generate the following `<command>` element in the generated Galaxy ToolConfig:
138	20
139	21	<command>TOOL_NAME --quiet --no-gui ...</command>
140	22
141
142		### Blacklisting Parameters
143
144		* Purpose: Some parameters present in the CTD are not to be exposed on Galaxy. Think of parameters such as `--help`, `--debug`, that might won't make much sense to be exposed to final users in a workflow management system such as Galaxy.
145		* Short/long version: `-b` / `--blacklist-parameters`
146		* Required: no.
147		* Taken values: A list of parameters to be blacklisted.
148
149		Example:
150
151		$ python generator.py ... -b h help quiet
152
153		Will not process any of the parameters named `h`, `help`, or `quiet` and will not appear in the generated Galaxy ToolConfig.
154
155		### Generating a tool_conf.xml file
156
157		* Purpose: Galaxy uses a file `tool_conf.xml` in which other tools can be included. `generator.py` can also generate this file. Categories will be extracted from the provided input CTDs and for each category, a different `<section>` will be generated. Any input CTD lacking a category will be sorted under the provided default category.
158		* Short/long version: `-t` / `--tool-conf-destination`
159		* Required: no.
160		* Taken values: The destination of the file.
161
162		### Providing a default Category
163
164		* Purpose: Input CTDs that lack a category will be sorted under the value given to this parameter. If this parameter is not given, then the category `DEFAULT` will be used.
	23	## Providing a default Category
	24	* Purpose: Input CTDs that lack a category will be sorted under the value given to this parameter. If this parameter is not provided, then the category `DEFAULT` will be used.
165	25	* Short/long version: `-c` / `--default-category`
166	26	* Required: no.
167	27	* Taken values: The value for the default category to use for input CTDs lacking a category.

170	30
171	31	Suppose there is a folder containing several CTD files. Some of those CTDs don't have the optional attribute `category` and the rest belong to the `Data Processing` category. The following invocation:
172	32
173		$ python generator.py ... -c Other
	33	$ python convert.py galaxy ... -c Other
174	34
175	35	will generate, for each of the categories, a different section. Additionally, CTDs lacking a category will be sorted under the given category, `Other`, as shown:
176	36

186	46	...
187	47	</section>
188	48
189		### Providing a Path for the Location of the ToolConfig Files
190
191		* Purpose: The `tool_conf.xml` file contains references to files which in turn contain Galaxy ToolConfig files. Using this parameter, you can provide information about the location of your tools.
	49	## Providing a Path for the Location of the ToolConfig Files
	50	* Purpose: The `tool_conf.xml` file contains references to files which in turn contain Galaxy ToolConfig files. Using this parameter, you can provide information about the location of your wrappers on your Galaxy instance.
192	51	* Short/long version: `-g` / `--galaxy-tool-path`
193	52	* Required: no.
194	53	* Taken values: The path, relative to your `$GALAXY_ROOT/tools` folder, in which your tools are located.
195	54
196	55	Example:
197	56
198		$ python generator.py ... -g my_tools_folder
	57	$ python convert.py galaxy ... -g my_tools_folder
199	58
200	59	Will generate `<tool>` elements in the generated `tool_conf.xml` as follows:
201	60

203	62
204	63	In this example, `tool_conf.xml` refers to a file located at `$GALAXY_ROOT/tools/my_tools_folder/some_tool.xml`.
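
The way the given path is combined with each generated file name can be sketched as follows. `tool_file_attribute` is a hypothetical helper written for illustration; it approximates, but does not reproduce, the logic of `generate_tool_conf` in `galaxy/converter.py`.

```python
def tool_file_attribute(galaxy_tool_path, file_name):
    """Build the value of the file attribute of a <tool> element.

    galaxy_tool_path may be None (no -g given); a trailing slash is
    added only when a non-empty path does not already end with one.
    """
    prefix = (galaxy_tool_path or "").strip()
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    return prefix + file_name
```

For example, `tool_file_attribute("my_tools_folder", "some_tool.xml")` yields `my_tools_folder/some_tool.xml`, which Galaxy resolves relative to `$GALAXY_ROOT/tools`.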
205	64
206
207		### Hardcoding Parameters
208
209		* Purpose: Fixing the value of a parameter and hide it from the end user.
210		* Short/long version: `-p` / `--hardcoded-parameters`
211		* Required: no.
212		* Taken values: The path of a file containing the mapping between parameter names and hardcoded values to use in the `<command>` section.
213
214		It is sometimes required that parameters are hidden from the end user in workflow systems such as Galaxy and that they take a predetermined value. Allowing end users to control parameters similar to `--verbosity`, `--threads`, etc., might create more problems than solving them. For this purpose, the parameter `p`/`--hardcoded-parameters` takes the path of a file that contains up to three columns separated by whitespace that map parameter names to the hardcoded value. The first column contains the name of the parameter and the second one the hardcoded value. The first two columns are mandatory.
215
216		If the parameter is to be hardcoded only for certain tools, a third column containing a comma separated list of tool names for which the hardcoding will apply can be added.
217
218		Lines starting with `#` will be ignored. The following is an example of a valid file:
219
220		# Parameter name # Value # Tool(s)
221		threads \${GALAXY_SLOTS:-24}
222		mode quiet
223		xtandem_executable xtandem XTandemAdapter
224		verbosity high Foo, Bar
225
226		This will produce a `<command>` section similar to the following one for all tools but `XTandemAdapter`, `Foo` and `Bar`:
227
228		<command>TOOL_NAME -threads \${GALAXY_SLOTS:-24} -mode quiet ...</command>
229
230		For `XTandemAdapter`, the `<command>` will be similar to:
231
232		<command>XtandemAdapter ... -threads \${GALAXY_SLOTS:-24} -mode quiet -xtandem_executable xtandem ...</command>
233
234		And for tools `Foo` and `Bar`, the `<command>` will be similar to:
235
236		<command>Foo ... ... -threads \${GALAXY_SLOTS:-24} -mode quiet -verbosity high ...</command>
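
The file layout described above could be parsed along these lines. This is only a sketch under stated assumptions: `parse_hardcoded_lines` and `hardcoded_values_for` are hypothetical names, not the functions the converter uses, and the third column is read up to the end of the line so that a tool list such as `Foo, Bar` (with a space after the comma) is still treated as a single column.

```python
def parse_hardcoded_lines(lines):
    """Parse 'name value [tool1,tool2]' lines into (name, value, tools) tuples.

    tools is None when the hardcoded value applies to every tool.
    """
    entries = []
    for line in lines:
        stripped = line.strip()
        # skip empty lines and comments
        if not stripped or stripped.startswith("#"):
            continue
        # split into at most three columns; the third keeps any inner spaces
        columns = stripped.split(None, 2)
        if len(columns) < 2:
            # the first two columns are mandatory; ignore malformed lines
            continue
        tools = None
        if len(columns) == 3:
            tools = [name.strip() for name in columns[2].split(",") if name.strip()]
        entries.append((columns[0], columns[1], tools))
    return entries


def hardcoded_values_for(tool_name, entries):
    """Return {parameter: value} for the entries that apply to tool_name."""
    return {name: value for name, value, tools in entries
            if tools is None or tool_name in tools}
```

Calling `hardcoded_values_for` once per tool gives the set of parameters whose values would be fixed in that tool's `<command>` section.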
237
238
239		### Including additional Macros Files
240
	65	## Including additional Macros Files
241	66	* Purpose: Include external macros files.
242	67	* Short/long version: `-m` / `--macros`
243	68	* Required: no.

246	71
247	72	ToolConfig supports elaborate sections such as `<stdio>`, `<requirements>`, etc., that are identical across tools of the same suite. Macros files assist in the task of including external xml sections into ToolConfig files. For more information about the syntax of macros files, see: https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax#Reusing_Repeated_Configuration_Elements
248	73
249		There are some macros that are required, namely `stdio`, `requirements` and `advanced_options`. A template macro file is included in [macros.xml]. It can be edited to suit your needs and you could add extra macros or leave it as it is and include additional files.
	74	There are some macros that are required, namely `stdio`, `requirements` and `advanced_options`. A template macros file is included in [macros.xml]. You can either edit it to suit your needs (for example, by adding extra macros) or leave it as it is and include additional files. Every macro found in the provided files will be expanded.
250	75
251		Every macro found in the included files and in `support_files/macros.xml` will be expanded. Users are responsible for copying the given macros files in their corresponding galaxy folders.
	76	Please note that the macros files used must be copied to your Galaxy installation, into the same location as the generated ToolConfig files; otherwise Galaxy will not be able to parse the generated ToolConfig files!
252	77
253		### Providing a default executable Path
254
255		* Purpose: Help Galaxy locate tools by providing a path.
256		* Short/long version: `-x` / `--default-executable-path`
257		* Required: no.
258		* Taken values: The default executable path of the tools in the Galaxy installation.
259
260		CTDs can contain an `<executablePath>` element that will be used when executing the tool binary. If this element is missing, the value provided by this parameter will be used as a prefix when building the `<command>` section. Suppose that you have installed a tool suite in your local Galaxy instance under `/opt/suite/bin`. The following invocation of the converter:
261
262		$ python generator.py -x /opt/suite/bin ...
263
264		Will produce a `<command>` section similar to:
265
266		<command>/opt/suite/bin/Foo ...</command>
267
268		For those CTDs in which no `<executablePath>` could be found.
269
270
271		### Generating a `datatypes_conf.xml` File
272
	78	## Generating a `datatypes_conf.xml` File
273	79	* Purpose: Specify the destination of a generated `datatypes_conf.xml` file.
274	80	* Short/long version: `-d` / `--datatypes-destination`
275	81	* Required: no.

277	83
278	84	It is likely that your tools use file formats or mimetypes that have not been registered in Galaxy. The generator allows you to specify a path in which an automatically generated `datatypes_conf.xml` file will be created. Consult the next section to get information about how to register file formats and mimetypes.
279	85
280
281		### Providing Galaxy File Formats
282
	86	## Providing Galaxy File Formats
283	87	* Purpose: Register new file formats and mimetypes.
284	88	* Short/long version: `-f` / `--formats-file`
285	89	* Required: no.

307	111
308	112	For information about Galaxy data types and subclasses, consult the following page: https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes
309	113
310
311		## Notes about some of the OpenMS Tools
312
313		* Most of the tools can be generated automatically. Some of the tools need some extra work (for now).
314		* These adapters need to be changed, such that you provide the path to the executable:
	114	## Remarks about some of the OpenMS Tools
	115	* Most of the tools can be generated automatically. However, some of the tools need some extra work (for now).
	116	* The following adapters need to be changed, such that you provide the path to the executable:
315	117	* FidoAdapter (add `-exe fido` in the command tag, delete the `$param_exe` in the command tag, delete the parameter from the input list).
316	118	* MSGFPlusAdapter (add `-executable msgfplus.jar` in the command tag, delete the `$param_executable` in the command tag, delete the parameter from the input list).
317	119	* MyriMatchAdapter (add `-myrimatch_executable myrimatch` in the command tag, delete the `$param_myrimatch_executable` in the command tag, delete the parameter from the input list).

320	122	* XTandemAdapter (add `-xtandem_executable xtandem` in the command tag, delete the `$param_xtandem_executable` in the command tag, delete the parameter from the input list).
321	123	* To avoid deleting these parameters from the inputs, you can also add them to the blacklist:
322	124
323		$ python generator.py -b exe executable myrimatch_excutable omssa_executable pepnovo_executable xtandem_executable
	125	$ python convert.py galaxy -b exe executable myrimatch_executable omssa_executable pepnovo_executable xtandem_executable
324	126
325		* These tools have multiple outputs (number of inputs = number of outputs) which is not yet supported in Galaxy-stable:
	127	* The following tools have multiple outputs (number of inputs = number of outputs) which is not yet supported in Galaxy-stable:
326	128	* SeedListGenerator
327	129	* SpecLibSearcher
328	130	* MapAlignerIdentification

-0

galaxy/__init__.py

(New empty file)

+992

-0

galaxy/converter.py

	0	#!/usr/bin/env python
	1	# encoding: utf-8
	2
	3	"""
	4	@author: delagarza
	5	"""
	6
	7	import os
	8	import string
	9
	10	from collections import OrderedDict
	11	from string import strip
	12	from lxml import etree
	13	from lxml.etree import SubElement, Element, ElementTree, ParseError, parse
	14
	15	from common import utils, logger
	16	from common.exceptions import ApplicationException, InvalidModelException
	17	from common.utils import ParsedCTD
	18
	19	from CTDopts.CTDopts import _InFile, _OutFile, ParameterGroup, _Choices, _NumericRange, _FileFormat, ModelError, _Null
	20
	21
	22	TYPE_TO_GALAXY_TYPE = {int: 'integer', float: 'float', str: 'text', bool: 'boolean', _InFile: 'data',
	23	_OutFile: 'data', _Choices: 'select'}
	24	STDIO_MACRO_NAME = "stdio"
	25	REQUIREMENTS_MACRO_NAME = "requirements"
	26	ADVANCED_OPTIONS_MACRO_NAME = "advanced_options"
	27
	28	REQUIRED_MACROS = [STDIO_MACRO_NAME, REQUIREMENTS_MACRO_NAME, ADVANCED_OPTIONS_MACRO_NAME]
	29
	30
	31	class ExitCode:
	32	def __init__(self, code_range="", level="", description=None):
	33	self.range = code_range
	34	self.level = level
	35	self.description = description
	36
	37
	38	class DataType:
	39	def __init__(self, extension, galaxy_extension=None, galaxy_type=None, mimetype=None):
	40	self.extension = extension
	41	self.galaxy_extension = galaxy_extension
	42	self.galaxy_type = galaxy_type
	43	self.mimetype = mimetype
	44
	45
	46	def add_specific_args(parser):
	47	parser.add_argument("-f", "--formats-file", dest="formats_file",
	48	help="File containing the supported file formats. Run with '-h' or '--help' to see a "
	49	"brief example on the layout of this file.", default=None, required=False)
	50	parser.add_argument("-a", "--add-to-command-line", dest="add_to_command_line",
	51	help="Adds content to the command line", default="", required=False)
	52	parser.add_argument("-d", "--datatypes-destination", dest="data_types_destination",
	53	help="Specify the location of a datatypes_conf.xml to modify and add the registered "
	54	"data types. If the provided destination does not exist, a new file will be created.",
	55	default=None, required=False)
	56	parser.add_argument("-c", "--default-category", dest="default_category", default="DEFAULT", required=False,
	57	help="Default category to use for tools lacking a category when generating tool_conf.xml")
	58	parser.add_argument("-t", "--tool-conf-destination", dest="tool_conf_destination", default=None, required=False,
	59	help="Specify the location of an existing tool_conf.xml that will be modified to include "
	60	"the converted tools. If the provided destination does not exist, a new file will "
	61	"be created.")
	62	parser.add_argument("-g", "--galaxy-tool-path", dest="galaxy_tool_path", default=None, required=False,
	63	help="The path that will be prepended to the file names when generating tool_conf.xml")
	64	parser.add_argument("-r", "--required-tools", dest="required_tools_file", default=None, required=False,
	65	help="Each line of the file will be interpreted as a tool name that needs translation. "
	66	"Run with '-h' or '--help' to see a brief example on the format of this file.")
	67	parser.add_argument("-s", "--skip-tools", dest="skip_tools_file", default=None, required=False,
	68	help="File containing a list of tools for which a Galaxy stub will not be generated. "
	69	"Run with '-h' or '--help' to see a brief example on the format of this file.")
	70	parser.add_argument("-m", "--macros", dest="macros_files", default=[], nargs="*",
	71	action="append", required=None, help="Import the additional given file(s) as macros. "
	72	"The macros stdio, requirements and advanced_options are required. Please see "
	73	"macros.xml for an example of a valid macros file. All defined macros will be imported.")
	74
	75
	76	def convert_models(args, parsed_ctds): # IGNORE:C0111
	77	# validate and prepare the passed arguments
	78	validate_and_prepare_args(args)
	79
	80	# extract the names of the macros and check that we have found the ones we need
	81	macros_to_expand = parse_macros_files(args.macros_files)
	82
	83	# parse the given supported file-formats file
	84	supported_file_formats = parse_file_formats(args.formats_file)
	85
	86	# parse the hardcoded parameters file
	87	parameter_hardcoder = utils.parse_hardcoded_parameters(args.hardcoded_parameters)
	88
	89	# parse the skip/required tools files
	90	skip_tools = parse_tools_list_file(args.skip_tools_file)
	91	required_tools = parse_tools_list_file(args.required_tools_file)
	92
	93	_convert_internal(parsed_ctds,
	94	supported_file_formats=supported_file_formats,
	95	default_executable_path=args.default_executable_path,
	96	add_to_command_line=args.add_to_command_line,
	97	blacklisted_parameters=args.blacklisted_parameters,
	98	required_tools=required_tools,
	99	skip_tools=skip_tools,
	100	macros_file_names=args.macros_files,
	101	macros_to_expand=macros_to_expand,
	102	parameter_hardcoder=parameter_hardcoder)
	103
	104	# generation of galaxy stubs is ready... now, let's see if we need to generate a tool_conf.xml
	105	if args.tool_conf_destination is not None:
	106	generate_tool_conf(parsed_ctds, args.tool_conf_destination,
	107	args.galaxy_tool_path, args.default_category)
	108
	109	# generate datatypes_conf.xml
	110	if args.data_types_destination is not None:
	111	generate_data_type_conf(supported_file_formats, args.data_types_destination)
	112
	113	return 0
	114
	115
	116	def parse_tools_list_file(tools_list_file):
	117	tools_list = None
	118	if tools_list_file is not None:
	119	tools_list = []
	120	with open(tools_list_file) as f:
	121	for line in f:
	122	if line is None or not line.strip() or line.strip().startswith("#"):
	123	continue
	124	else:
	125	tools_list.append(line.strip())
	126
	127	return tools_list
	128
	129
	130	def parse_macros_files(macros_file_names):
	131	macros_to_expand = set()
	132
	133	for macros_file_name in macros_file_names:
	134	try:
	135	macros_file = open(macros_file_name)
	136	logger.info("Loading macros from %s" % macros_file_name, 0)
	137	root = parse(macros_file).getroot()
	138	for xml_element in root.findall("xml"):
	139	name = xml_element.attrib["name"]
	140	if name in macros_to_expand:
	141	logger.warning("Macro %s has already been found. Duplicate found in file %s." %
	142	(name, macros_file_name), 0)
	143	else:
	144	logger.info("Macro %s found" % name, 1)
	145	macros_to_expand.add(name)
	146	except ParseError, e:
	147	raise ApplicationException("The macros file " + macros_file_name + " could not be parsed. Cause: " +
	148	str(e))
	149	except IOError, e:
	150	raise ApplicationException("The macros file " + macros_file_name + " could not be opened. Cause: " +
	151	str(e))
	152
	153	# we depend on "stdio", "requirements" and "advanced_options" to exist in at least one of the given macros files
	154	missing_needed_macros = []
	155	for required_macro in REQUIRED_MACROS:
	156	if required_macro not in macros_to_expand:
	157	missing_needed_macros.append(required_macro)
	158
	159	if missing_needed_macros:
	160	raise ApplicationException(
	161	"The following required macro(s) were not found in any of the given macros files: %s, "
	162	"see galaxy/macros.xml for an example of a valid macros file."
	163	% ", ".join(missing_needed_macros))
	164
	165	# we do not need to "expand" the advanced_options macro
	166	macros_to_expand.remove(ADVANCED_OPTIONS_MACRO_NAME)
	167	return macros_to_expand
	168
	169
	170	def parse_file_formats(formats_file):
	171	supported_formats = {}
	172	if formats_file is not None:
	173	line_number = 0
	174	with open(formats_file) as f:
	175	for line in f:
	176	line_number += 1
	177	if line is None or not line.strip() or line.strip().startswith("#"):
	178	# ignore (it'd be weird to have something like:
	179	# if line is not None and not (not line.strip()) ...
	180	pass
	181	else:
	182	# not an empty line, no comment
	183	# strip the line and split by whitespace
	184	parsed_formats = line.strip().split()
	185	# valid lines contain one, three or four columns
	186	if not (len(parsed_formats) == 1 or len(parsed_formats) == 3 or len(parsed_formats) == 4):
	187	logger.warning(
	188	"Invalid line at line number %d of the given formats file. Line will be ignored:\n%s" %
	189	(line_number, line), 0)
	190	# ignore the line
	191	continue
	192	elif len(parsed_formats) == 1:
	193	supported_formats[parsed_formats[0]] = DataType(parsed_formats[0], parsed_formats[0])
	194	else:
	195	mimetype = None
	196	# check if mimetype was provided
	197	if len(parsed_formats) == 4:
	198	mimetype = parsed_formats[3]
	199	supported_formats[parsed_formats[0]] = DataType(parsed_formats[0], parsed_formats[1],
	200	parsed_formats[2], mimetype)
	201	return supported_formats
	202
	203
	204	def validate_and_prepare_args(args):
	205	# check that only one of skip_tools_file and required_tools_file has been provided
	206	if args.skip_tools_file is not None and args.required_tools_file is not None:
	207	raise ApplicationException(
	208	"You have provided both a file with tools to ignore and a file with required tools.\n"
	209	"Only one of -s/--skip-tools, -r/--required-tools can be provided.")
	210
	211	# flatten macros_files to make sure that we have a list containing file names and not a list of lists
	212	utils.flatten_list_of_lists(args, "macros_files")
	213
	214	# check that the arguments point to a valid, existing path
	215	input_variables_to_check = ["skip_tools_file", "required_tools_file", "macros_files", "formats_file"]
	216	for variable_name in input_variables_to_check:
	217	utils.validate_argument_is_valid_path(args, variable_name)
	218
	219	# check that the provided output files, if provided, contain a valid file path (i.e., not a folder)
	220	output_variables_to_check = ["data_types_destination", "tool_conf_destination"]
	221	for variable_name in output_variables_to_check:
	222	file_name = getattr(args, variable_name)
	223	if file_name is not None and os.path.isdir(file_name):
	224	raise ApplicationException("The provided output file name (%s) points to a directory." % file_name)
	225
	226	if not args.macros_files:
	227	# list is empty, provide the default value
	228	logger.warning("Using default macros from galaxy/macros.xml", 0)
	229	args.macros_files = ["galaxy/macros.xml"]
	230
	231
	232	def get_preferred_file_extension():
	233	return "xml"
	234
	235
	236	def _convert_internal(parsed_ctds, **kwargs):
	237	# parse all input files into models using CTDopts (via utils)
	238	# the output is a tuple containing the model, output destination, origin file
	239	for parsed_ctd in parsed_ctds:
	240	model = parsed_ctd.ctd_model
	241	origin_file = parsed_ctd.input_file
	242	output_file = parsed_ctd.suggested_output_file
	243
	244	if kwargs["skip_tools"] is not None and model.name in kwargs["skip_tools"]:
	245	logger.info("Skipping tool %s" % model.name, 0)
	246	continue
	247	elif kwargs["required_tools"] is not None and model.name not in kwargs["required_tools"]:
	248	logger.info("Tool %s is not required, skipping it" % model.name, 0)
	249	continue
	250	else:
	251	logger.info("Converting from %s " % origin_file, 0)
	252	tool = create_tool(model)
	253	write_header(tool, model)
	254	create_description(tool, model)
	255	expand_macros(tool, model, **kwargs)
	256	create_command(tool, model, **kwargs)
	257	create_inputs(tool, model, **kwargs)
	258	create_outputs(tool, model, **kwargs)
	259	create_help(tool, model)
	260
	261	# wrap our tool element into a tree to be able to serialize it
	262	tree = ElementTree(tool)
	263	tree.write(open(output_file, 'w'), encoding="UTF-8", xml_declaration=True, pretty_print=True)
	264
	265
	266	def write_header(tool, model):
	267	tool.addprevious(etree.Comment(
	268	"This is a configuration file for the integration of a tool into Galaxy (https://galaxyproject.org/). "
	269	"This file was automatically generated using CTDConverter."))
	270	tool.addprevious(etree.Comment('Proposed Tool Section: [%s]' % model.opt_attribs.get("category", "")))
	271
	272
	273	def generate_tool_conf(parsed_ctds, tool_conf_destination, galaxy_tool_path, default_category):
	274	# for each category, we keep a list of models corresponding to it
	275	categories_to_tools = dict()
	276	for parsed_ctd in parsed_ctds:
	277	category = strip(parsed_ctd.ctd_model.opt_attribs.get("category", ""))
	278	if not category.strip():
	279	category = default_category
	280	if category not in categories_to_tools:
	281	categories_to_tools[category] = []
	282	categories_to_tools[category].append(utils.get_filename(parsed_ctd.suggested_output_file))
	283
	284	# at this point, we should have a map for all categories->tools
	285	toolbox_node = Element("toolbox")
	286
	287	if galaxy_tool_path is not None and not galaxy_tool_path.strip().endswith("/"):
	288	galaxy_tool_path = galaxy_tool_path.strip() + "/"
	289	if galaxy_tool_path is None:
	290	galaxy_tool_path = ""
	291
	292	for category, file_names in categories_to_tools.iteritems():
	293	section_node = add_child_node(toolbox_node, "section")
	294	section_node.attrib["id"] = "section-id-" + "".join(category.split())
	295	section_node.attrib["name"] = category
	296
	297	for filename in file_names:
	298	tool_node = add_child_node(section_node, "tool")
	299	tool_node.attrib["file"] = galaxy_tool_path + filename
	300
	301	toolconf_tree = ElementTree(toolbox_node)
	302	toolconf_tree.write(open(tool_conf_destination,'w'), encoding="UTF-8", xml_declaration=True, pretty_print=True)
	303	logger.info("Generated Galaxy tool_conf.xml in %s" % tool_conf_destination, 0)
	304
	305
	306	def generate_data_type_conf(supported_file_formats, data_types_destination):
	307	data_types_node = Element("datatypes")
	308	registration_node = add_child_node(data_types_node, "registration")
	309	registration_node.attrib["converters_path"] = "lib/galaxy/datatypes/converters"
	310	registration_node.attrib["display_path"] = "display_applications"
	311
	312	for format_name in supported_file_formats:
	313	data_type = supported_file_formats[format_name]
	314	# add only if it's a data type that does not exist in Galaxy
	315	if data_type.galaxy_type is not None:
	316	data_type_node = add_child_node(registration_node, "datatype")
	317	# we know galaxy_extension is not None
	318	data_type_node.attrib["extension"] = data_type.galaxy_extension
	319	data_type_node.attrib["type"] = data_type.galaxy_type
	320	if data_type.mimetype is not None:
	321	data_type_node.attrib["mimetype"] = data_type.mimetype
	322
	323	data_types_tree = ElementTree(data_types_node)
	324	data_types_tree.write(open(data_types_destination,'w'), encoding="UTF-8", xml_declaration=True, pretty_print=True)
	325	logger.info("Generated Galaxy datatypes_conf.xml in %s" % data_types_destination, 0)
	326
	327
	328	def create_tool(model):
	329	return Element("tool", OrderedDict([("id", model.name), ("name", model.name), ("version", model.version)]))
	330
	331
	332	def create_description(tool, model):
	333	if "description" in model.opt_attribs.keys() and model.opt_attribs["description"] is not None:
	334	description = SubElement(tool,"description")
	335	description.text = model.opt_attribs["description"]
	336
	337
	338	def get_param_cli_name(param, model):
	339	# we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy)
	340	if type(param.parent) == ParameterGroup:
	341	if not hasattr(param.parent.parent, 'parent'):
	342	return resolve_param_mapping(param, model)
	343	elif not hasattr(param.parent.parent.parent, 'parent'):
	344	return resolve_param_mapping(param, model)
	345	else:
	346	if model.cli:
	347	logger.warning("Using nested parameter sections (NODE elements) is not compatible with <cli>", 1)
	348	return get_param_name(param.parent) + ":" + resolve_param_mapping(param, model)
	349	else:
	350	return resolve_param_mapping(param, model)
	351
	352
	353	def get_param_name(param):
	354	# we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy)
	355	if type(param.parent) == ParameterGroup:
	356	if not hasattr(param.parent.parent, 'parent'):
	357	return param.name
	358	elif not hasattr(param.parent.parent.parent, 'parent'):
	359	return param.name
	360	else:
	361	return get_param_name(param.parent) + ":" + param.name
	362	else:
	363	return param.name
	364
	365
	366	# some parameters are mapped to command line options, this method helps resolve those mappings, if any
	367	def resolve_param_mapping(param, model):
	368	# go through all mappings and find if the given param appears as a reference name in a mapping element
	369	param_mapping = None
	370	for cli_element in model.cli:
	371	for mapping_element in cli_element.mappings:
	372	if mapping_element.reference_name == param.name:
	373	if param_mapping is not None:
	374	logger.warning("The parameter %s has more than one mapping in the <cli> section. "
	375	"The first found mapping, %s, will be used." % (param.name, param_mapping), 1)
	376	else:
	377	param_mapping = cli_element.option_identifier
	378
	379	return param_mapping if param_mapping is not None else param.name
	380
	381
	382	def create_command(tool, model, **kwargs):
	383	final_command = get_tool_executable_path(model, kwargs["default_executable_path"]) + '\n'
	384	final_command += kwargs["add_to_command_line"] + '\n'
	385	advanced_command_start = "#if $adv_opts.adv_opts_selector=='advanced':\n"
	386	advanced_command_end = '#end if'
	387	advanced_command = ''
	388	parameter_hardcoder = kwargs["parameter_hardcoder"]
	389
	390	found_output_parameter = False
	391	for param in extract_parameters(model):
	392	if param.type is _OutFile:
	393	found_output_parameter = True
	394	command = ''
	395	param_name = get_param_name(param)
	396	param_cli_name = get_param_cli_name(param, model)
	397	if param_name == param_cli_name:
	398	# there was no mapping, so for the cli name we will use a '-' in the prefix
	399	param_cli_name = '-' + param_name
	400
	401	if param.name in kwargs["blacklisted_parameters"]:
	402	continue
	403
	404	hardcoded_value = parameter_hardcoder.get_hardcoded_value(param_name, model.name)
	405	if hardcoded_value:
	406	command += '%s %s\n' % (param_cli_name, hardcoded_value)
	407	else:
	408	# parameter is neither blacklisted nor hardcoded...
	409	galaxy_parameter_name = get_galaxy_parameter_name(param)
	410	repeat_galaxy_parameter_name = get_repeat_galaxy_parameter_name(param)
	411
	412	# logic for ITEMLISTs
	413	if param.is_list:
	414	if param.type is _InFile:
	415	command += param_cli_name + "\n"
	416	command += " #for token in $" + galaxy_parameter_name + ":\n"
	417	command += " $token\n"
	418	command += " #end for\n"
	419	else:
	420	command += "\n#if $" + repeat_galaxy_parameter_name + ":\n"
	421	command += param_cli_name + "\n"
	422	command += " #for token in $" + repeat_galaxy_parameter_name + ":\n"
	423	command += " #if \" \" in str(token):\n"
	424	command += " \"$token." + galaxy_parameter_name + "\"\n"
	425	command += " #else\n"
	426	command += " $token." + galaxy_parameter_name + "\n"
	427	command += " #end if\n"
	428	command += " #end for\n"
	429	command += "#end if\n"
	430	# logic for other ITEMs
	431	else:
	432	if param.advanced and param.type is not _OutFile:
	433	actual_parameter = "$adv_opts.%s" % galaxy_parameter_name
	434	else:
	435	actual_parameter = "$%s" % galaxy_parameter_name
	436	# TODO only useful for text fields, integers or floats
	437	# not useful for choices, input fields ...
	438
	439	if not is_boolean_parameter(param) and type(param.restrictions) is _Choices :
	440	command += "#if " + actual_parameter + ":\n"
	441	command += ' %s\n' % param_cli_name
	442	command += " #if \" \" in str(" + actual_parameter + "):\n"
	443	command += " \"" + actual_parameter + "\"\n"
	444	command += " #else\n"
	445	command += " " + actual_parameter + "\n"
	446	command += " #end if\n"
	447	command += "#end if\n"
	448	elif is_boolean_parameter(param):
	449	command += "#if " + actual_parameter + ":\n"
	450	command += ' %s\n' % param_cli_name
	451	command += "#end if\n"
	452	elif TYPE_TO_GALAXY_TYPE[param.type] == 'text':
	453	command += "#if " + actual_parameter + ":\n"
	454	command += " %s " % param_cli_name
	455	command += " \"" + actual_parameter + "\"\n"
	456	command += "#end if\n"
	457	else:
	458	command += "#if " + actual_parameter + ":\n"
	459	command += ' %s ' % param_cli_name
	460	command += actual_parameter + "\n"
	461	command += "#end if\n"
	462
	463	if param.advanced and param.type is not _OutFile:
	464	advanced_command += " %s" % command
	465	else:
	466	final_command += command
	467
	468	if advanced_command:
	469	final_command += "%s%s%s\n" % (advanced_command_start, advanced_command, advanced_command_end)
	470
	471	if not found_output_parameter:
	472	final_command += "> $param_stdout\n"
	473
	474	command_node = add_child_node(tool, "command")
	475	command_node.text = final_command
	476
	477
	478	# creates the xml elements needed to import the needed macros files
	479	# and to "expand" the macros
	480	def expand_macros(tool, model, **kwargs):
	481	macros_node = add_child_node(tool, "macros")
	482	token_node = add_child_node(macros_node, "token")
	483	token_node.attrib["name"] = "@EXECUTABLE@"
	484	token_node.text = get_tool_executable_path(model, kwargs["default_executable_path"])
	485
	486	# add <import> nodes
	487	for macro_file_name in kwargs["macros_file_names"]:
	488	macro_file = open(macro_file_name)
	489	import_node = add_child_node(macros_node, "import")
	490	# do not add the path of the file, rather, just its basename
	491	import_node.text = os.path.basename(macro_file.name)
	492
	493	# add <expand> nodes
	494	for expand_macro in kwargs["macros_to_expand"]:
	495	expand_node = add_child_node(tool, "expand")
	496	expand_node.attrib["macro"] = expand_macro
	497
	498
	499	def get_tool_executable_path(model, default_executable_path):
	500	# rules to build the galaxy executable path:
	501	# if executablePath is null, then use default_executable_path and store it in executablePath
	502	# if executablePath is null and executableName is null, then the name of the tool will be used
	503	# if executablePath is null and executableName is not null, then executableName will be used
	504	# if executablePath is not null and executableName is null,
	505	# then executablePath and the name of the tool will be used
	506	# if executablePath is not null and executableName is not null, then both will be used
	507
	508	# first, check if the model has executablePath / executableName defined
	509	executable_path = model.opt_attribs.get("executablePath", None)
	510	executable_name = model.opt_attribs.get("executableName", None)
	511
	512	# check if we need to use the default_executable_path
	513	if executable_path is None:
	514	executable_path = default_executable_path
	515
	516	# fix the executablePath to make sure that there is a '/' in the end
	517	if executable_path is not None:
	518	executable_path = executable_path.strip()
	519	if not executable_path.endswith('/'):
	520	executable_path += '/'
	521
	522	# assume that we have all information present
	523	command = str(executable_path) + str(executable_name)
	524	if executable_path is None:
	525	if executable_name is None:
	526	command = model.name
	527	else:
	528	command = executable_name
	529	else:
	530	if executable_name is None:
	531	command = executable_path + model.name
	532	return command
	533
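Review note: the rules listed in the comments of `get_tool_executable_path` can be restated as a small standalone function. This is only a sketch for checking the rules in isolation; `build_executable_command` and its plain-string arguments are hypothetical stand-ins for the model attributes and are not part of this commit.

```python
def build_executable_command(tool_name, executable_path=None,
                             executable_name=None, default_path=None):
    """Hypothetical restatement of the executable path rules.

    tool_name stands in for model.name; the other arguments stand in for
    the executablePath / executableName attributes and the default path.
    """
    # use the default path only when the model does not provide one
    path = executable_path if executable_path is not None else default_path
    # use the tool name only when the model does not provide an executable name
    name = executable_name if executable_name is not None else tool_name

    if path is None:
        # no path at all: the bare name is the command
        return name

    # make sure the path ends with exactly one trailing '/'
    path = path.strip()
    if not path.endswith('/'):
        path += '/'
    return path + name
```

Handling the two fallbacks first leaves a single branch on whether a path exists, which covers the same four combinations as the nested `if` blocks above.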
	534
	535	def get_galaxy_parameter_name(param):
	536	return "param_%s" % get_param_name(param).replace(':', '_').replace('-', '_')
	537
	538
	539	def get_input_with_same_restrictions(out_param, model, supported_file_formats):
	540	for param in extract_parameters(model):
	541	if param.type is _InFile:
	542	if param.restrictions is not None:
	543	in_param_formats = get_supported_file_types(param.restrictions.formats, supported_file_formats)
	544	out_param_formats = get_supported_file_types(out_param.restrictions.formats, supported_file_formats)
	545	if in_param_formats == out_param_formats:
	546	return param
	547
	548
	549	def create_inputs(tool, model, **kwargs):
	550	inputs_node = SubElement(tool, "inputs")
	551
	552	# some suites (such as OpenMS) need some advanced options when handling inputs
	553	expand_advanced_node = add_child_node(tool, "expand", OrderedDict([("macro", ADVANCED_OPTIONS_MACRO_NAME)]))
	554	parameter_hardcoder = kwargs["parameter_hardcoder"]
	555
	556	# treat all non output-file parameters as inputs
	557	for param in extract_parameters(model):
	558	# no need to show hardcoded parameters
	559	hardcoded_value = parameter_hardcoder.get_hardcoded_value(param.name, model.name)
	560	if param.name in kwargs["blacklisted_parameters"] or hardcoded_value:
	561	# let's not use an extra level of indentation and use NOP
	562	continue
	563	if param.type is not _OutFile:
	564	if param.advanced:
	565	if expand_advanced_node is not None:
	566	parent_node = expand_advanced_node
	567	else:
	568	# something went wrong... we are handling an advanced parameter and the
	569	# advanced input macro was not set... inform the user about it
	570	logger.info("The parameter %s has been set as advanced, but advanced_input_macro has "
	571	"not been set." % param.name, 1)
	572	# there is not much we can do, other than use the inputs_node as a parent node!
	573	parent_node = inputs_node
	574	else:
	575	parent_node = inputs_node
	576
	577	# for lists we need a repeat tag
	578	if param.is_list and param.type is not _InFile:
	579	rep_node = add_child_node(parent_node, "repeat")
	580	create_repeat_attribute_list(rep_node, param)
	581	parent_node = rep_node
	582
	583	param_node = add_child_node(parent_node, "param")
	584	create_param_attribute_list(param_node, param, kwargs["supported_file_formats"])
	585
	586	# advanced parameter selection should be at the end
	587	# and only available if an advanced parameter exists
	588	if expand_advanced_node is not None and len(expand_advanced_node) > 0:
	589	inputs_node.append(expand_advanced_node)
	590
	591
	592	def get_repeat_galaxy_parameter_name(param):
	593	return "rep_" + get_galaxy_parameter_name(param)
	594
	595
	596	def create_repeat_attribute_list(rep_node, param):
	597	rep_node.attrib["name"] = get_repeat_galaxy_parameter_name(param)
	598	if param.required:
	599	rep_node.attrib["min"] = "1"
	600	else:
	601	rep_node.attrib["min"] = "0"
	602	# for the ITEMLISTs which have LISTITEM children we only
	603	# need one parameter as it is given as a string
	604	if param.default is not None:
	605	rep_node.attrib["max"] = "1"
	606	rep_node.attrib["title"] = get_galaxy_parameter_name(param)
	607
	608
	609	def create_param_attribute_list(param_node, param, supported_file_formats):
	610	param_node.attrib["name"] = get_galaxy_parameter_name(param)
	611
	612	param_type = TYPE_TO_GALAXY_TYPE.get(param.type)
	613	if param_type is None:
	614	raise ModelError("Unrecognized parameter type %(type)s for parameter %(name)s"
	615	% {"type": param.type, "name": param.name})
	616
	617	if param.is_list:
	618	param_type = "text"
	619
	620	if is_selection_parameter(param):
	621	param_type = "select"
	622	if len(param.restrictions.choices) < 5:
	623	param_node.attrib["display"] = "radio"
	624
	625	if is_boolean_parameter(param):
	626	param_type = "boolean"
	627
	628	if param.type is _InFile:
	629	# assume it's just text unless restrictions are provided
	630	param_format = "txt"
	631	if param.restrictions is not None:
	632	# join all formats of the file, take mapping from supported_file if available for an entry
	633	if type(param.restrictions) is _FileFormat:
	634	param_format = ','.join([get_supported_file_type(i, supported_file_formats) if
	635	get_supported_file_type(i, supported_file_formats)
	636	else i for i in param.restrictions.formats])
	637	else:
	638	raise InvalidModelException("Expected 'file type' restrictions for input file [%(name)s], "
	639	"but instead got [%(type)s]"
	640	% {"name": param.name, "type": type(param.restrictions)})
	641
	642	param_node.attrib["type"] = "data"
	643	param_node.attrib["format"] = param_format
	644	# in the case of multiple input set multiple flag
	645	if param.is_list:
	646	param_node.attrib["multiple"] = "true"
	647
	648	else:
	649	param_node.attrib["type"] = param_type
	650
	651	# check for parameters with restricted values (which will correspond to a "select" in galaxy)
	652	if param.restrictions is not None:
	653	# it could be either _Choices or _NumericRange, with special case for boolean types
	654	if param_type == "boolean":
	655	create_boolean_parameter(param_node, param)
	656	elif type(param.restrictions) is _Choices:
	657	# create as many <option> elements as restriction values
	658	for choice in param.restrictions.choices:
	659	option_node = add_child_node(param_node, "option", OrderedDict([("value", str(choice))]))
	660	option_node.text = str(choice)
	661
	662	# preselect the default value
	663	if param.default == choice:
	664	option_node.attrib["selected"] = "true"
	665
	666	elif type(param.restrictions) is _NumericRange:
	667	if param.type is not int and param.type is not float:
	668	raise InvalidModelException("Expected either 'int' or 'float' in the numeric range restriction for "
	669	"parameter [%(name)s], but instead got [%(type)s]" %
	670	{"name": param.name, "type": type(param.restrictions)})
	671	# extract the min and max values and add them as attributes
	672	# validate the provided min and max values
	673	if param.restrictions.n_min is not None:
	674	param_node.attrib["min"] = str(param.restrictions.n_min)
	675	if param.restrictions.n_max is not None:
	676	param_node.attrib["max"] = str(param.restrictions.n_max)
	677	elif type(param.restrictions) is _FileFormat:
	678	param_node.attrib["format"] = ','.join([get_supported_file_type(i, supported_file_formats) if
	679	get_supported_file_type(i, supported_file_formats)
	680	else i for i in param.restrictions.formats])
	681	else:
	682	raise InvalidModelException("Unrecognized restriction type [%(type)s] for parameter [%(name)s]"
	683	% {"type": type(param.restrictions), "name": param.name})
	684
	685	if param_type == "select" and param.default in param.restrictions.choices:
	686	param_node.attrib["optional"] = "False"
	687	else:
	688	param_node.attrib["optional"] = str(not param.required)
	689
	690	if param_type == "text":
	691	# add size attribute... this is the length of a textbox field in Galaxy (it could also be 15x2, for instance)
	692	param_node.attrib["size"] = "30"
	693	# add sanitizer nodes, this is needed for special character like "["
	694	# which are used for example by FeatureFinderMultiplex
	695	sanitizer_node = SubElement(param_node, "sanitizer")
	696
	697	valid_node = SubElement(sanitizer_node, "valid", OrderedDict([("initial", "string.printable")]))
	698	add_child_node(valid_node, "remove", OrderedDict([("value", '\'')]))
	699	add_child_node(valid_node, "remove", OrderedDict([("value", '"')]))
	700
	701	# check for default value
	702	if param.default is not None and param.default is not _Null:
	703	if type(param.default) is list:
	704	# we ASSUME that a list of parameters looks like:
	705	# $ tool -ignore He Ar Xe
	706	# meaning, that, for example, Helium, Argon and Xenon will be ignored
	707	param_node.attrib["value"] = ' '.join(map(str, param.default))
	708
	709	elif param_type != "boolean":
	710	param_node.attrib["value"] = str(param.default)
	711
	712	else:
	713	# simple boolean with a default
	714	if param.default is True:
	715	param_node.attrib["checked"] = "true"
	716	else:
	717	if param.type is int or param.type is float:
	718	# galaxy requires "value" to be included for int/float
	719	# since no default was included, we need to figure out one in a clever way... but let the user know
	720	# that we are "thinking" for him/her
	721	logger.warning("Generating default value for parameter [%s]. "
	722	"Galaxy requires the attribute 'value' to be set for integer/floats. "
	723	"Edit the CTD file and provide a suitable default value." % param.name, 1)
	724	# check if there's a min/max and try to use them
	725	default_value = None
	726	if param.restrictions is not None:
	727	if type(param.restrictions) is _NumericRange:
	728	default_value = param.restrictions.n_min
	729	if default_value is None:
	730	default_value = param.restrictions.n_max
	731	if default_value is None:
	732	# no min/max provided... just use 0 and see what happens
	733	default_value = 0
	734	else:
	735	# should never be here, since we have validated this anyway...
	736	# this code is here just for documentation purposes
	737	# however, better safe than sorry!
	738	# (it could be that the code changes and then we have an ugly scenario)
	739	raise InvalidModelException("Expected a numeric range for parameter [%(name)s], "
	740	"but instead got [%(type)s]"
	741	% {"name": param.name, "type": type(param.restrictions)})
	742	else:
	743	# no restrictions and no default value provided...
	744	# make up something
	745	default_value = 0
	746	param_node.attrib["value"] = str(default_value)
	747
	748	label = "%s parameter" % param.name
	749	help_text = ""
	750
	751	if param.description is not None:
	752	label, help_text = generate_label_and_help(param.description)
	753
	754	param_node.attrib["label"] = label
	755	param_node.attrib["help"] = "(-%s)" % param.name + " " + help_text
	756
	757
	758	def generate_label_and_help(desc):
	759	help_text = ""
	760	# This tag is found in some descriptions
	761	if not isinstance(desc, basestring):
	762	desc = str(desc)
	763	desc = desc.encode("utf8").replace("#br#", " <br>")
	764	# Get rid of dots in the end
	765	if desc.endswith("."):
	766	desc = desc.rstrip(".")
	767	# Check if first word is a normal word and make it uppercase
	768	if str(desc).find(" ") > -1:
	769	first_word, rest = str(desc).split(" ", 1)
	770	if str(first_word).islower():
	771	# check if label has a quotient of the form a/b
	772	if first_word.find("/") != 1 :
	773	first_word = first_word.capitalize()
	774	desc = first_word + " " + rest
	775	label = desc.decode("utf8")
	776
	777	# Try to split the label if it is too long
	778	if len(desc) > 50:
	779	# find an example and put everything before in the label and the e.g. in the help
	780	if desc.find("e.g.") > 1 :
	781	label, help_text = desc.split("e.g.",1)
	782	help_text = "e.g." + help_text
	783	else:
	784	# find the end of the first sentence
	785	# look for ". " because some labels contain .file or something similar
	786	delimiter = ""
	787	if desc.find(". ") > 1 and desc.find("? ") > 1:
	788	if desc.find(". ") < desc.find("? "):
	789	delimiter = ". "
	790	else:
	791	delimiter = "? "
	792	elif desc.find(". ") > 1:
	793	delimiter = ". "
	794	elif desc.find("? ") > 1:
	795	delimiter = "? "
	796	if delimiter != "":
	797	label, help_text = desc.split(delimiter, 1)
	798
	799	# add the question mark back
	800	if delimiter == "? ":
	801	label += "? "
	802
	803	# remove all linebreaks
	804	label = label.rstrip().rstrip('<br>').rstrip()
	805	return label, help_text
	806
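Review note: `str.rstrip` treats its argument as a set of characters, not as a suffix, so `rstrip('<br>')` also removes any trailing `<`, `b`, `r` or `>` that belongs to the label itself. The sketch below shows the effect and one possible suffix-based alternative; `strip_trailing_br` is a hypothetical helper and is not part of this commit.

```python
def strip_trailing_br(label, tag="<br>"):
    """Remove trailing whitespace and trailing copies of `tag` (as a whole suffix)."""
    label = label.rstrip()
    while label.endswith(tag):
        # cut the complete tag, then any whitespace that preceded it
        label = label[:-len(tag)].rstrip()
    return label


# rstrip with a character set also eats the final 'r' of the word itself
damaged = "Output folder".rstrip("<br>")

# the suffix-based helper leaves the word intact
intact = strip_trailing_br("Output folder <br>")
```

Here `damaged` ends up as `"Output folde"`, while `intact` is `"Output folder"`.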
	807
	808	# determines if the given choices are boolean (basically, if the possible values are yes/no, true/false)
	809	def is_boolean_parameter(param):
	810	# detect boolean selects of OpenMS
	811	if is_selection_parameter(param):
	812	if len(param.restrictions.choices) == 2:
	813	# check that default value is false to make sure it is an actual flag
	814	if "false" in param.restrictions.choices and \
	815	"true" in param.restrictions.choices and \
	816	param.default == "false":
	817	return True
	818	else:
	819	return param.type is bool
	820
	821
	822	# determines if there are choices for the parameter
	823	def is_selection_parameter(param):
	824	return type(param.restrictions) is _Choices
	825
	826
	827	def get_lowercase_list(some_list):
	828	lowercase_list = map(str, some_list)
	829	lowercase_list = map(string.lower, lowercase_list)
	830	lowercase_list = map(strip, lowercase_list)
	831	return lowercase_list
	832
	833
	834	# creates a galaxy boolean parameter type
	835	# this method assumes that param has restrictions, and that only two restrictions are present
	836	# (either yes/no or true/false)
	837	def create_boolean_parameter(param_node, param):
	838	# first, determine the 'truevalue' and the 'falsevalue'
	839	"""TODO: true and false values can be way more than 'true' and 'false'
	840	but for that we need CTD support
	841	"""
	842	# by default, 'true' and 'false' are handled as flags, like the verbose flag (i.e., -v)
	843	true_value = "-%s" % get_param_name(param)
	844	false_value = ""
	845	choices = get_lowercase_list(param.restrictions.choices)
	846	if "yes" in choices:
	847	true_value = "yes"
	848	false_value = "no"
	849	param_node.attrib["truevalue"] = true_value
	850	param_node.attrib["falsevalue"] = false_value
	851
	852	# set the checked attribute
	853	if param.default is not None:
	854	checked_value = "false"
	855	default = strip(string.lower(param.default))
	856	if default == "yes" or default == "true":
	857	checked_value = "true"
	858	param_node.attrib["checked"] = checked_value
	859
	860
	861	def create_outputs(parent, model, **kwargs):
	862	outputs_node = add_child_node(parent, "outputs")
	863	parameter_hardcoder = kwargs["parameter_hardcoder"]
	864
	865	for param in extract_parameters(model):
	866
	867	# no need to show hardcoded parameters
	868	hardcoded_value = parameter_hardcoder.get_hardcoded_value(param.name, model.name)
	869	if param.name in kwargs["blacklisted_parameters"] or hardcoded_value:
	870	# let's not use an extra level of indentation and use NOP
	871	continue
	872	if param.type is _OutFile:
	873	create_output_node(outputs_node, param, model, kwargs["supported_file_formats"])
	874
	875	# If there are no outputs defined in the ctd the node will have no children
	876	# and the stdout will be used as output
	877	if len(outputs_node) == 0:
	878	add_child_node(outputs_node, "data",
	879	OrderedDict([("name", "param_stdout"), ("format", "txt"), ("label", "Output from stdout")]))
	880
	881
	882	def create_output_node(parent, param, model, supported_file_formats):
	883	data_node = add_child_node(parent, "data")
	884	data_node.attrib["name"] = get_galaxy_parameter_name(param)
	885
	886	data_format = "data"
	887	if param.restrictions is not None:
	888	if type(param.restrictions) is _FileFormat:
	889	# set the first data output node to the first file format
	890
	891	# check if there are formats that have not been registered yet...
	892	output = list()
	893	for format_name in param.restrictions.formats:
	894	if format_name not in supported_file_formats.keys():
	895	output.append(str(format_name))
	896
	897	# warn only if there is something to complain about
	898	if output:
	899	logger.warning("Parameter " + param.name + " has the following unsupported format(s):"
	900	+ ','.join(output), 1)
	901	data_format = ','.join(output)
	902
	903	formats = get_supported_file_types(param.restrictions.formats, supported_file_formats)
	904	try:
	905	data_format = formats.pop()
	906	except KeyError:
	907	# there is not much we can do, other than catching the exception
	908	pass
	909	# if there is more than one output file format, try to take the format from the input parameter
	910	if formats:
	911	corresponding_input = get_input_with_same_restrictions(param, model, supported_file_formats)
	912	if corresponding_input is not None:
	913	data_format = "input"
	914	data_node.attrib["metadata_source"] = get_galaxy_parameter_name(corresponding_input)
	915	else:
	916	raise InvalidModelException("Unrecognized restriction type [%(type)s] "
	917	"for output [%(name)s]" % {"type": type(param.restrictions),
	918	"name": param.name})
	919	data_node.attrib["format"] = data_format
	920
	921	# TODO: find a smarter label ?
	922	return data_node
	923
	924
	925	# Get the supported file format for one given format
	926	def get_supported_file_type(format_name, supported_file_formats):
	927	if format_name in supported_file_formats.keys():
	928	return supported_file_formats.get(format_name, DataType(format_name, format_name)).galaxy_extension
	929	else:
	930	return None
	931
	932
	933	def get_supported_file_types(formats, supported_file_formats):
	934	return set([supported_file_formats.get(format_name, DataType(format_name, format_name)).galaxy_extension
	935	for format_name in formats if format_name in supported_file_formats.keys()])
	936
	937
	938	def create_change_format_node(parent, data_formats, input_ref):
	939	# <change_format>
	940	# <when input="secondary_structure" value="true" format="txt"/>
	941	# </change_format>
	942	change_format_node = add_child_node(parent, "change_format")
	943	for data_format in data_formats:
	944	add_child_node(change_format_node, "when",
	945	OrderedDict([("input", input_ref), ("value", data_format), ("format", data_format)]))
	946
	947
	948	# creates the <help> node from the model's optional "manual" and "docurl" attributes
	949	def create_help(tool, model):
	950	manual = ''
	951	doc_url = None
	952	if 'manual' in model.opt_attribs.keys():
	953	manual += '%s\n\n' % model.opt_attribs["manual"]
	954	if 'docurl' in model.opt_attribs.keys():
	955	doc_url = model.opt_attribs["docurl"]
	956
	957	help_text = "No help available"
	958	if manual:
	959	help_text = manual
	960	if doc_url is not None:
	961	help_text = ("" if manual is None else manual) + "\nFor more information, visit %s" % doc_url
	962	help_node = add_child_node(tool, "help")
	963	# TODO: do we need CDATA Section here?
	964	help_node.text = help_text
	965
	966
	967	# since a model might contain several ParameterGroup elements,
	968	# we want to simply 'flatten' the parameters to generate the Galaxy wrapper
	969	def extract_parameters(model):
	970	parameters = []
	971	if len(model.parameters.parameters) > 0:
	972	# use this to put parameters that are to be processed
	973	# we know that CTDModel has one parent ParameterGroup
	974	pending = [model.parameters]
	975	while len(pending) > 0:
	976	# take one element from 'pending'
	977	parameter = pending.pop()
	978	if type(parameter) is not ParameterGroup:
	979	parameters.append(parameter)
	980	else:
	981	# append the first-level children of this ParameterGroup
	982	pending.extend(parameter.parameters.values())
	983	# return the reversed list of parameters (as it is now,
	984	# we have the last parameter in the CTD as first in the list)
	985	return reversed(parameters)
	986
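Review note: `extract_parameters` flattens the parameter tree with an explicit stack and then reverses the result. The sketch below reproduces that traversal on a toy tree to show why the final reversal restores document order. `Group` is a hypothetical stand-in for `ParameterGroup`, and plain strings stand in for parameters.

```python
class Group(object):
    """Hypothetical stand-in for a parameter group with ordered children."""

    def __init__(self, name, children):
        self.name = name
        self.children = children


def flatten(root):
    leaves = []
    pending = [root]
    while pending:
        # pop() takes the most recently added node, so children are visited last-to-first
        node = pending.pop()
        if isinstance(node, Group):
            pending.extend(node.children)
        else:
            leaves.append(node)
    # leaves were collected in reverse document order; flip them back
    return list(reversed(leaves))


tree = Group("1", ["in", Group("algorithm", ["threshold", "mode"]), "out"])
flat = flatten(tree)
```

With this toy tree, `flat` comes out as `["in", "threshold", "mode", "out"]`, the order in which the leaves appear in the tree.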
	987
	988	# adds and returns a child node using the given name to the given parent node
	989	def add_child_node(parent_node, child_node_name, attributes=OrderedDict([])):
	990	child_node = SubElement(parent_node, child_node_name, attributes)
	991	return child_node

-2

~~galaxy/dist/conda/bld.bat~~

0		"%PYTHON%" setup.py install
1		if errorlevel 1 exit 1

-1

~~galaxy/dist/conda/build.sh~~

$PYTHON setup.py install

-28

~~galaxy/dist/conda/meta.yaml~~

0		package:
1		name: ctd2galaxy
2		version: "1.0"
3
4		source:
5		git_rev: v1.0
6		git_url: https://github.com/WorkflowConversion/CTD2Galaxy.git
7
8		build:
9		noarch_python: True
10
11		requirements:
12		build:
13		- python
14		- setuptools
15
16		run:
17		- python
18		- lxml
19		- ctdopts 1.0
20
21		test:
22		imports:
23		- CTDopts.CTDopts
24
25		about:
26		home: https://github.com/WorkflowConversion/CTD2Galaxy
27		license_file: LICENSE

-1389

~~galaxy/generator.py~~

0		#!/usr/bin/env python
1		# encoding: utf-8
2
3		"""
4		@author: delagarza
5		"""
6
7
8		import sys
9		import os
10		import traceback
11		import ntpath
12		import string
13
14		from argparse import ArgumentParser
15		from argparse import RawDescriptionHelpFormatter
16		from collections import OrderedDict
17		from string import strip
18		from lxml import etree
19		from lxml.etree import SubElement, Element, ElementTree, ParseError, parse
20
21		from CTDopts.CTDopts import CTDModel, _InFile, _OutFile, ParameterGroup, _Choices, _NumericRange, _FileFormat, \
22		ModelError, _Null
23
24		__all__ = []
25		__version__ = 1.0
26		__date__ = '2014-09-17'
27		__updated__ = '2016-05-09'
28
29		MESSAGE_INDENTATION_INCREMENT = 2
30
31		TYPE_TO_GALAXY_TYPE = {int: 'integer', float: 'float', str: 'text', bool: 'boolean', _InFile: 'data',
32		_OutFile: 'data', _Choices: 'select'}
33
34		STDIO_MACRO_NAME = "stdio"
35		REQUIREMENTS_MACRO_NAME = "requirements"
36		ADVANCED_OPTIONS_MACRO_NAME = "advanced_options"
37
38		REQUIRED_MACROS = [STDIO_MACRO_NAME, REQUIREMENTS_MACRO_NAME, ADVANCED_OPTIONS_MACRO_NAME]
39
40
41		class CLIError(Exception):
42		# Generic exception to raise and log different fatal errors.
43		def __init__(self, msg):
44		super(CLIError).__init__(type(self))
45		self.msg = "E: %s" % msg
46
47		def __str__(self):
48		return self.msg
49
50		def __unicode__(self):
51		return self.msg
52
53
54		class InvalidModelException(ModelError):
55		def __init__(self, message):
56		super(InvalidModelException, self).__init__()
57		self.message = message
58
59		def __str__(self):
60		return self.message
61
62		def __repr__(self):
63		return self.message
64
65
66		class ApplicationException(Exception):
67		def __init__(self, msg):
68		super(ApplicationException).__init__(type(self))
69		self.msg = msg
70
71		def __str__(self):
72		return self.msg
73
74		def __unicode__(self):
75		return self.msg
76
77
78		class ExitCode:
79		def __init__(self, code_range="", level="", description=None):
80		self.range = code_range
81		self.level = level
82		self.description = description
83
84
85		class DataType:
86		def __init__(self, extension, galaxy_extension=None, galaxy_type=None, mimetype=None):
87		self.extension = extension
88		self.galaxy_extension = galaxy_extension
89		self.galaxy_type = galaxy_type
90		self.mimetype = mimetype
91
92
93		class ParameterHardcoder:
94		def __init__(self):
95		# map whose keys are the composite names of tools and parameters in the following pattern:
96		# [ToolName][separator][ParameterName] -> HardcodedValue
97		# if the parameter applies to all tools, then the following pattern is used:
98		# [ParameterName] -> HardcodedValue
99
100		# examples (assuming separator is '#'):
101		# threads -> 24
102		# XtandemAdapter#adapter -> xtandem.exe
103		# adapter -> adapter.exe
104		self.separator = "!"
105		self.parameter_map = {}
106
107		# the most specific value will be returned in case of overlap
108		def get_hardcoded_value(self, parameter_name, tool_name):
109		# look for the value that would apply for all tools
110		generic_value = self.parameter_map.get(parameter_name, None)
111		specific_value = self.parameter_map.get(self.build_key(parameter_name, tool_name), None)
112		if specific_value is not None:
113		return specific_value
114
115		return generic_value
116
117		def register_parameter(self, parameter_name, parameter_value, tool_name=None):
118		self.parameter_map[self.build_key(parameter_name, tool_name)] = parameter_value
119
120		def build_key(self, parameter_name, tool_name):
121		if tool_name is None:
122		return parameter_name
123		return "%s%s%s" % (parameter_name, self.separator, tool_name)
124
125
126		def main(argv=None): # IGNORE:C0111
127		# Command line options.
128		if argv is None:
129		argv = sys.argv
130		else:
131		sys.argv.extend(argv)
132
133		program_version = "v%s" % __version__
134		program_build_date = str(__updated__)
135		program_version_message = '%%(prog)s %s (%s)' % (program_version, program_build_date)
136		program_short_description = "CTD2Galaxy - A project from the GenericWorkflowNodes family " \
137		"(https://github.com/orgs/genericworkflownodes)"
138		program_usage = '''
139		USAGE:
140
141		I - Parsing a single CTD file and generate a Galaxy wrapper:
142
143		$ python generator.py -i input.ctd -o output.xml
144
145
146		II - Parsing all found CTD files (files with .ctd and .xml extension) in a given folder and
147		output converted Galaxy wrappers in a given folder:
148
149		$ python generator.py -i /home/user/*.ctd -o /home/user/galaxywrappers
150
151
152		III - Providing file formats, mimetypes
153
154		Galaxy supports the concept of file format in order to connect compatible ports, that is, input ports of a certain
155		data format will be able to receive data from a port from the same format. This converter allows you to provide
156		a personalized file in which you can relate the CTD data formats with supported Galaxy data formats. The layout of
157		this file consists of lines, each of either one or four columns separated by any amount of whitespace. The content
158		of each column is as follows:
159
160		* 1st column: file extension
161		* 2nd column: data type, as listed in Galaxy
162		* 3rd column: full-named Galaxy data type, as it will appear on datatypes_conf.xml
163		* 4th column: mimetype (optional)
164
165		The following is an example of a valid "file formats" file:
166
167		########################################## FILE FORMATS example ##########################################
168		# Every line starting with a # will be handled as a comment and will not be parsed.
169		# The first column is the file format as given in the CTD and second column is the Galaxy data format.
170		# The second, third, fourth and fifth column can be left empty if the data type has already been registered
171		# in Galaxy, otherwise, all but the mimetype must be provided.
172
173		# CTD type # Galaxy type # Long Galaxy data type # Mimetype
174		csv tabular galaxy.datatypes.data:Text
175		fasta
176		ini txt galaxy.datatypes.data:Text
177		txt
178		idxml txt galaxy.datatypes.xml:GenericXml application/xml
179		options txt galaxy.datatypes.data:Text
180		grid grid galaxy.datatypes.data:Grid
181
182		##########################################################################################################
183
184		Note that each line consists precisely of either one, three or four columns. In the case of data types already
185		registered in Galaxy (such as fasta and txt in the above example), only the first column is needed. In the case of
186		data types that haven't been yet registered in Galaxy, the first three columns are needed (mimetype is optional).
187
188		For information about Galaxy data types and subclasses, see the following page:
189		https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes
190
191
192		IV - Hardcoding parameters
193
194		It is possible to hardcode parameters. This makes sense if you want to set a tool in Galaxy in 'quiet' mode or if
195		your tools support multi-threading and accept the number of threads via a parameter, without giving the end user the
196		chance to change the values for these parameters.
197
198		In order to generate hardcoded parameters, you need to provide a simple file. Each line of this file contains two
199		or three columns separated by whitespace. Any line starting with a '#' will be ignored. The first column contains
200		the name of the parameter, the second column contains the value that will always be set for this parameter. The
201		first two columns are mandatory.
202
203		If the parameter is to be hardcoded only for a set of tools, then a third column can be added. This column includes
204		a comma-separated list of tool names for which the parameter will be hardcoded. If a third column is not included,
205		then all processed tools containing the given parameter will get a hardcoded value for it.
206
207		The following is an example of a valid file:
208
209		##################################### HARDCODED PARAMETERS example #####################################
210		# Every line starting with a # will be handled as a comment and will not be parsed.
211		# The first column is the name of the parameter and the second column is the value that will be used.
212
213		# Parameter name # Value # Tool(s)
214		threads \${GALAXY_SLOTS:-24}
215		mode quiet
216		xtandem_executable xtandem XTandemAdapter
217		verbosity high Foo, Bar
218
219		#########################################################################################################
220
221		Using the above file will produce a <command> similar to:
222
223		[tool_name] ... -threads \${GALAXY_SLOTS:-24} -mode quiet ...
224
225		For all tools. For XTandemAdapter, the <command> will be similar to:
226
227		XtandemAdapter ... -threads \${GALAXY_SLOTS:-24} -mode quiet -xtandem_executable xtandem ...
228
229		And for tools Foo and Bar, the <command> will be similar to:
230
231		Foo ... ... -threads \${GALAXY_SLOTS:-24} -mode quiet -verbosity high ...
232
233
234		V - Control which tools will be converted
235
236		Sometimes only a subset of CTDs needs to be converted. It is possible to either explicitly specify which tools will
237		be converted or which tools will not be converted.
238
239		The value of the -s/--skip-tools parameter is a file in which each line will be interpreted as the name of a tool
240		that will not be converted. Conversely, the value of the -r/--required-tools is a file in which each line will be
241		interpreted as a tool that is required. Only one of these parameters can be specified at a given time.
242
243		The format of both files is exactly the same. As stated before, each line will be interpreted as the name of a tool;
244		any line starting with a '#' will be ignored.
245
246		'''
247		program_license = '''%(short_description)s
248		Copyright 2015, Luis de la Garza
249
250		Licensed under the Apache License, Version 2.0 (the "License");
251		you may not use this file except in compliance with the License.
252		You may obtain a copy of the License at
253
254		http://www.apache.org/licenses/LICENSE-2.0
255
256		Unless required by applicable law or agreed to in writing, software
257		distributed under the License is distributed on an "AS IS" BASIS,
258		WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
259		See the License for the specific language governing permissions and
260		limitations under the License.
261
262		%(usage)s
263		''' % {'short_description': program_short_description, 'usage': program_usage}
264
265		try:
266		# Setup argument parser
267		parser = ArgumentParser(prog="CTD2Galaxy", description=program_license,
268		formatter_class=RawDescriptionHelpFormatter, add_help=True)
269		parser.add_argument("-i", "--input", dest="input_files", default=[], required=True, nargs="+", action="append",
270		help="List of CTD files to convert.")
271		parser.add_argument("-o", "--output-destination", dest="output_destination", required=True,
272		help="If multiple input files are given, then a folder in which all generated "
273		"XMLs will be generated is expected;"
274		"if a single input file is given, then a destination file is expected.")
275		parser.add_argument("-f", "--formats-file", dest="formats_file",
276		help="File containing the supported file formats. Run with '-h' or '--help' to see a "
277		"brief example on the layout of this file.", default=None, required=False)
278		parser.add_argument("-a", "--add-to-command-line", dest="add_to_command_line",
279		help="Adds content to the command line", default="", required=False)
280		parser.add_argument("-d", "--datatypes-destination", dest="data_types_destination",
281		help="Specify the location of a datatypes_conf.xml to modify and add the registered "
282		"data types. If the provided destination does not exist, a new file will be created.",
283		default=None, required=False)
284		parser.add_argument("-x", "--default-executable-path", dest="default_executable_path",
285		help="Use this executable path when <executablePath> is not present in the CTD",
286		default=None, required=False)
287		parser.add_argument("-b", "--blacklist-parameters", dest="blacklisted_parameters", default=[], nargs="+", action="append",
288		help="List of parameters that will be ignored and won't appear on the galaxy stub",
289		required=False)
290		parser.add_argument("-c", "--default-category", dest="default_category", default="DEFAULT", required=False,
291		help="Default category to use for tools lacking a category when generating tool_conf.xml")
292		parser.add_argument("-t", "--tool-conf-destination", dest="tool_conf_destination", default=None, required=False,
293		help="Specify the location of an existing tool_conf.xml that will be modified to include "
294		"the converted tools. If the provided destination does not exist, a new file will"
295		"be created.")
296		parser.add_argument("-g", "--galaxy-tool-path", dest="galaxy_tool_path", default=None, required=False,
297		help="The path that will be prepended to the file names when generating tool_conf.xml")
298		parser.add_argument("-r", "--required-tools", dest="required_tools_file", default=None, required=False,
299		help="Each line of the file will be interpreted as a tool name that needs translation. "
300		"Run with '-h' or '--help' to see a brief example on the format of this file.")
301		parser.add_argument("-s", "--skip-tools", dest="skip_tools_file", default=None, required=False,
302		help="File containing a list of tools for which a Galaxy stub will not be generated. "
303		"Run with '-h' or '--help' to see a brief example on the format of this file.")
304		parser.add_argument("-m", "--macros", dest="macros_files", default=[], nargs="*",
305		action="append", required=None, help="Import the additional given file(s) as macros. "
306		"The macros stdio, requirements and advanced_options are required. Please see "
307		"macros.xml for an example of a valid macros file. All defined macros will be imported.")
308		parser.add_argument("-p", "--hardcoded-parameters", dest="hardcoded_parameters", default=None, required=False,
309		help="File containing hardcoded values for the given parameters. Run with '-h' or '--help' "
310		"to see a brief example on the format of this file.")
311		parser.add_argument("-v", "--validation-schema", dest="xsd_location", default=None, required=False,
312		help="Location of the schema to use to validate CTDs.")
313
314		# TODO: add verbosity, maybe?
315		parser.add_argument("-V", "--version", action='version', version=program_version_message)
316
317		# Process arguments
318		args = parser.parse_args()
319
320		# validate and prepare the passed arguments
321		validate_and_prepare_args(args)
322
323		# extract the names of the macros and check that we have found the ones we need
324		macros_to_expand = parse_macros_files(args.macros_files)
325
326		# parse the given supported file-formats file
327		supported_file_formats = parse_file_formats(args.formats_file)
328
329		# parse the hardcoded parameters file
330		parameter_hardcoder = parse_hardcoded_parameters(args.hardcoded_parameters)
331
332		# parse the skip/required tools files
333		skip_tools = parse_tools_list_file(args.skip_tools_file)
334		required_tools = parse_tools_list_file(args.required_tools_file)
335
336		#if verbose > 0:
337		# print("Verbose mode on")
338		parsed_models = convert(args.input_files,
339		args.output_destination,
340		supported_file_formats=supported_file_formats,
341		default_executable_path=args.default_executable_path,
342		add_to_command_line=args.add_to_command_line,
343		blacklisted_parameters=args.blacklisted_parameters,
344		required_tools=required_tools,
345		skip_tools=skip_tools,
346		macros_file_names=args.macros_files,
347		macros_to_expand=macros_to_expand,
348		parameter_hardcoder=parameter_hardcoder,
349		xsd_location=args.xsd_location)
350
351		#TODO: add some sort of warning if a macro that doesn't exist is to be expanded
352
353		# it is not needed to copy the macros files, since the user has provided them
354
355		# generation of galaxy stubs is ready... now, let's see if we need to generate a tool_conf.xml
356		if args.tool_conf_destination is not None:
357		generate_tool_conf(parsed_models, args.tool_conf_destination,
358		args.galaxy_tool_path, args.default_category)
359
360		# now datatypes_conf.xml
361		if args.data_types_destination is not None:
362		generate_data_type_conf(supported_file_formats, args.data_types_destination)
363
364		return 0
365
366		except KeyboardInterrupt:
367		# handle keyboard interrupt
368		return 0
369		except ApplicationException, e:
370		error("CTD2Galaxy could not complete the requested operation.", 0)
371		error("Reason: " + e.msg, 0)
372		return 1
373		except ModelError, e:
374		error("There seems to be a problem with one of your input CTDs.", 0)
375		error("Reason: " + e.msg, 0)
376		return 1
377		except Exception, e:
378		traceback.print_exc()
379		return 2
380
381
382		def parse_tools_list_file(tools_list_file):
383		tools_list = None
384		if tools_list_file is not None:
385		tools_list = []
386		with open(tools_list_file) as f:
387		for line in f:
388		if line is None or not line.strip() or line.strip().startswith("#"):
389		continue
390		else:
391		tools_list.append(line.strip())
392
393		return tools_list
394
395
396		def parse_macros_files(macros_file_names):
397		macros_to_expand = set()
398
399		for macros_file_name in macros_file_names:
400		try:
401		macros_file = open(macros_file_name)
402		info("Loading macros from %s" % macros_file_name, 0)
403		root = parse(macros_file).getroot()
404		for xml_element in root.findall("xml"):
405		name = xml_element.attrib["name"]
406		if name in macros_to_expand:
407		warning("Macro %s has already been found. Duplicate found in file %s." %
408		(name, macros_file_name), 0)
409		else:
410		info("Macro %s found" % name, 1)
411		macros_to_expand.add(name)
412		except ParseError, e:
413		raise ApplicationException("The macros file " + macros_file_name + " could not be parsed. Cause: " +
414		str(e))
415		except IOError, e:
416		raise ApplicationException("The macros file " + macros_file_name + " could not be opened. Cause: " +
417		str(e))
418
419		# we depend on "stdio", "requirements" and "advanced_options" to exist on all the given macros files
420		missing_needed_macros = []
421		for required_macro in REQUIRED_MACROS:
422		if required_macro not in macros_to_expand:
423		missing_needed_macros.append(required_macro)
424
425		if missing_needed_macros:
426		raise ApplicationException(
427		"The following required macro(s) were not found in any of the given macros files: %s, "
428		"see sample_files/macros.xml for an example of a valid macros file."
429		% ", ".join(missing_needed_macros))
430
431		# we do not need to "expand" the advanced_options macro
432		macros_to_expand.remove(ADVANCED_OPTIONS_MACRO_NAME)
433		return macros_to_expand
434
435
436		def parse_hardcoded_parameters(hardcoded_parameters_file):
437		parameter_hardcoder = ParameterHardcoder()
438		if hardcoded_parameters_file is not None:
439		line_number = 0
440		with open(hardcoded_parameters_file) as f:
441		for line in f:
442		line_number += 1
443		if line is None or not line.strip() or line.strip().startswith("#"):
444		pass
445		else:
446		# the third column must be obtained as a whole, and not split
447		parsed_hardcoded_parameter = line.strip().split(None, 2)
448		# valid lines contain two or three columns
449		if len(parsed_hardcoded_parameter) != 2 and len(parsed_hardcoded_parameter) != 3:
450		warning("Invalid line at line number %d of the given hardcoded parameters file. Line will be"
451		"ignored:\n%s" % (line_number, line), 0)
452		continue
453
454		parameter_name = parsed_hardcoded_parameter[0]
455		hardcoded_value = parsed_hardcoded_parameter[1]
456		tool_names = None
457		if len(parsed_hardcoded_parameter) == 3:
458		tool_names = parsed_hardcoded_parameter[2].split(',')
459		if tool_names:
460		for tool_name in tool_names:
461		parameter_hardcoder.register_parameter(parameter_name, hardcoded_value, tool_name.strip())
462		else:
463		parameter_hardcoder.register_parameter(parameter_name, hardcoded_value)
464
465		return parameter_hardcoder
466
467
468		def parse_file_formats(formats_file):
469		supported_formats = {}
470		if formats_file is not None:
471		line_number = 0
472		with open(formats_file) as f:
473		for line in f:
474		line_number += 1
475		if line is None or not line.strip() or line.strip().startswith("#"):
476		# ignore (it'd be weird to have something like:
477		# if line is not None and not (not line.strip()) ...
478		pass
479		else:
480		# not an empty line, no comment
481		# strip the line and split by whitespace
482		parsed_formats = line.strip().split()
483		# valid lines contain either one, three or four columns
484		if not (len(parsed_formats) == 1 or len(parsed_formats) == 3 or len(parsed_formats) == 4):
485		warning("Invalid line at line number %d of the given formats file. Line will be ignored:\n%s" %
486		(line_number, line), 0)
487		# ignore the line
488		continue
489		elif len(parsed_formats) == 1:
490		supported_formats[parsed_formats[0]] = DataType(parsed_formats[0], parsed_formats[0])
491		else:
492		mimetype = None
493		# check if mimetype was provided
494		if len(parsed_formats) == 4:
495		mimetype = parsed_formats[3]
496		supported_formats[parsed_formats[0]] = DataType(parsed_formats[0], parsed_formats[1],
497		parsed_formats[2], mimetype)
498		return supported_formats
499
500
501		def validate_and_prepare_args(args):
502		# check that only one of skip_tools_file and required_tools_file has been provided
503		if args.skip_tools_file is not None and args.required_tools_file is not None:
504		raise ApplicationException(
505		"You have provided both a file with tools to ignore and a file with required tools.\n"
506		"Only one of -s/--skip-tools, -r/--required-tools can be provided.")
507
508		# first, we convert all list of lists in args to flat lists
509		lists_to_flatten = ["input_files", "blacklisted_parameters", "macros_files"]
510		for list_to_flatten in lists_to_flatten:
511		setattr(args, list_to_flatten, [item for sub_list in getattr(args, list_to_flatten) for item in sub_list])
512
513		# if input is a single file, we expect output to be a file (and not a dir that already exists)
514		if len(args.input_files) == 1:
515		if os.path.isdir(args.output_destination):
516		raise ApplicationException("If a single input file is provided, output (%s) is expected to be a file "
517		"and not a folder.\n" % args.output_destination)
518
519		# if input is a list of files, we expect output to be a folder
520		if len(args.input_files) > 1:
521		if not os.path.isdir(args.output_destination):
522		raise ApplicationException("If several input files are provided, output (%s) is expected to be an "
523		"existing directory.\n" % args.output_destination)
524
525		# check that the provided input files, if provided, contain a valid file path
526		input_variables_to_check = ["skip_tools_file", "required_tools_file", "macros_files", "xsd_location",
527		"input_files", "formats_file", "hardcoded_parameters"]
528
529		for variable_name in input_variables_to_check:
530		paths_to_check = []
531		# check if we are handling a single file or a list of files
532		member_value = getattr(args, variable_name)
533		if member_value is not None:
534		if isinstance(member_value, list):
535		for file_name in member_value:
536		paths_to_check.append(strip(str(file_name)))
537		else:
538		paths_to_check.append(strip(str(member_value)))
539
540		for path_to_check in paths_to_check:
541		if not os.path.isfile(path_to_check) or not os.path.exists(path_to_check):
542		raise ApplicationException(
543		"The provided input file (%s) does not exist or is not a valid file path."
544		% path_to_check)
545
546		# check that the provided output files, if provided, contain a valid file path (i.e., not a folder)
547		output_variables_to_check = ["data_types_destination", "tool_conf_destination"]
548
549		for variable_name in output_variables_to_check:
550		file_name = getattr(args, variable_name)
551		if file_name is not None and os.path.isdir(file_name):
552		raise ApplicationException("The provided output file name (%s) points to a directory." % file_name)
553
554		if not args.macros_files:
555		# list is empty, provide the default value
556		warning("Using default macros from macros.xml", 0)
557		args.macros_files = ["macros.xml"]
558
559
560		def convert(input_files, output_destination, **kwargs):
561		# first, generate a model
562		is_converting_multiple_ctds = len(input_files) > 1
563		parsed_models = []
564		schema = None
565		if kwargs["xsd_location"] is not None:
566		try:
567		info("Loading validation schema from %s" % kwargs["xsd_location"], 0)
568		schema = etree.XMLSchema(etree.parse(kwargs["xsd_location"]))
569		except Exception, e:
570		error("Could not load validation schema %s. Reason: %s" % (kwargs["xsd_location"], str(e)), 0)
571		else:
572		info("Validation against a schema has not been enabled.", 0)
573		for input_file in input_files:
574		try:
575		if schema is not None:
576		validate_against_schema(input_file, schema)
577		model = CTDModel(from_file=input_file)
578		except Exception, e:
579		error(str(e), 1)
580		continue
581
582		if kwargs["skip_tools"] is not None and model.name in kwargs["skip_tools"]:
583		info("Skipping tool %s" % model.name, 0)
584		continue
585		elif kwargs["required_tools"] is not None and model.name not in kwargs["required_tools"]:
586		info("Tool %s is not required, skipping it" % model.name, 0)
587		continue
588		else:
589		info("Converting from %s " % input_file, 0)
590		tool = create_tool(model)
591		write_header(tool, model)
592		create_description(tool, model)
593		expand_macros(tool, model, **kwargs)
594		create_command(tool, model, **kwargs)
595		create_inputs(tool, model, **kwargs)
596		create_outputs(tool, model, **kwargs)
597		create_help(tool, model)
598
599		# finally, serialize the tool
600		output_file = output_destination
601		# if multiple inputs are being converted,
602		# then we need to generate a different output_file for each input
603		if is_converting_multiple_ctds:
604		output_file = os.path.join(output_file, get_filename_without_suffix(input_file) + ".xml")
605		# wrap our tool element into a tree to be able to serialize it
606		tree = ElementTree(tool)
607		tree.write(open(output_file, 'w'), encoding="UTF-8", xml_declaration=True, pretty_print=True)
608		# let's use model to hold the name of the output file
609		parsed_models.append([model, get_filename(output_file)])
610
611		return parsed_models
612
613
614		# validates a ctd file against the schema
615		def validate_against_schema(ctd_file, schema):
616		try:
617		parser = etree.XMLParser(schema=schema)
618		etree.parse(ctd_file, parser=parser)
619		except etree.XMLSyntaxError, e:
620		raise ApplicationException("Input ctd file %s is not valid. Reason: %s" % (ctd_file, str(e)))
621
622
623		def write_header(tool, model):
624		tool.addprevious(etree.Comment(
625		"This is a configuration file for the integration of a tools into Galaxy (https://galaxyproject.org/). "
626		"This file was automatically generated using CTD2Galaxy."))
627		tool.addprevious(etree.Comment('Proposed Tool Section: [%s]' % model.opt_attribs.get("category", "")))
628
629
630		def generate_tool_conf(parsed_models, tool_conf_destination, galaxy_tool_path, default_category):
631		# for each category, we keep a list of models corresponding to it
632		categories_to_tools = dict()
633		for model in parsed_models:
634		category = strip(model[0].opt_attribs.get("category", ""))
635		if not category.strip():
636		category = default_category
637		if category not in categories_to_tools:
638		categories_to_tools[category] = []
639		categories_to_tools[category].append(model[1])
640
641		# at this point, we should have a map for all categories->tools
642		toolbox_node = Element("toolbox")
643
644		if galaxy_tool_path is not None and not galaxy_tool_path.strip().endswith("/"):
645		galaxy_tool_path = galaxy_tool_path.strip() + "/"
646		if galaxy_tool_path is None:
647		galaxy_tool_path = ""
648
649		for category, file_names in categories_to_tools.iteritems():
650		section_node = add_child_node(toolbox_node, "section")
651		section_node.attrib["id"] = "section-id-" + "".join(category.split())
652		section_node.attrib["name"] = category
653
654		for filename in file_names:
655		tool_node = add_child_node(section_node, "tool")
656		tool_node.attrib["file"] = galaxy_tool_path + filename
657
658		toolconf_tree = ElementTree(toolbox_node)
659		toolconf_tree.write(open(tool_conf_destination,'w'), encoding="UTF-8", xml_declaration=True, pretty_print=True)
660		info("Generated Galaxy tool_conf.xml in %s" % tool_conf_destination, 0)
661
662
663		def generate_data_type_conf(supported_file_formats, data_types_destination):
664		data_types_node = Element("datatypes")
665		registration_node = add_child_node(data_types_node, "registration")
666		registration_node.attrib["converters_path"] = "lib/galaxy/datatypes/converters"
667		registration_node.attrib["display_path"] = "display_applications"
668
669		for format_name in supported_file_formats:
670		data_type = supported_file_formats[format_name]
671		# add only if it's a data type that does not exist in Galaxy
672		if data_type.galaxy_type is not None:
673		data_type_node = add_child_node(registration_node, "datatype")
674		# we know galaxy_extension is not None
675		data_type_node.attrib["extension"] = data_type.galaxy_extension
676		data_type_node.attrib["type"] = data_type.galaxy_type
677		if data_type.mimetype is not None:
678		data_type_node.attrib["mimetype"] = data_type.mimetype
679
680		data_types_tree = ElementTree(data_types_node)
681		data_types_tree.write(open(data_types_destination,'w'), encoding="UTF-8", xml_declaration=True, pretty_print=True)
682		info("Generated Galaxy datatypes_conf.xml in %s" % data_types_destination, 0)
683
684
685		# taken from
686		# http://stackoverflow.com/questions/8384737/python-extract-file-name-from-path-no-matter-what-the-os-path-format
687		def get_filename(path):
688		head, tail = ntpath.split(path)
689		return tail or ntpath.basename(head)
690
691
692		def get_filename_without_suffix(path):
693		root, ext = os.path.splitext(os.path.basename(path))
694		return root
695
696
697		def create_tool(model):
698		return Element("tool", OrderedDict([("id", model.name), ("name", model.name), ("version", model.version)]))
699
700
701		def create_description(tool, model):
702		if "description" in model.opt_attribs.keys() and model.opt_attribs["description"] is not None:
703		description = SubElement(tool,"description")
704		description.text = model.opt_attribs["description"]
705
706
707		def get_param_cli_name(param, model):
708		# we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy)
709		if type(param.parent) == ParameterGroup:
710		if not hasattr(param.parent.parent, 'parent'):
711		return resolve_param_mapping(param, model)
712		elif not hasattr(param.parent.parent.parent, 'parent'):
713		return resolve_param_mapping(param, model)
714		else:
715		if model.cli:
716		warning("Using nested parameter sections (NODE elements) is not compatible with <cli>", 1)
717		return get_param_name(param.parent) + ":" + resolve_param_mapping(param, model)
718		else:
719		return resolve_param_mapping(param, model)
720
721
722		def get_param_name(param):
723		# we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy)
724		if type(param.parent) == ParameterGroup:
725		if not hasattr(param.parent.parent, 'parent'):
726		return param.name
727		elif not hasattr(param.parent.parent.parent, 'parent'):
728		return param.name
729		else:
730		return get_param_name(param.parent) + ":" + param.name
731		else:
732		return param.name
733
734
735		# some parameters are mapped to command line options, this method helps resolve those mappings, if any
736		def resolve_param_mapping(param, model):
737		# go through all mappings and find if the given param appears as a reference name in a mapping element
738		param_mapping = None
739		for cli_element in model.cli:
740		for mapping_element in cli_element.mappings:
741		if mapping_element.reference_name == param.name:
742		if param_mapping is not None:
743		warning("The parameter %s has more than one mapping in the <cli> section. "
744		"The first found mapping, %s, will be used." % (param.name, param_mapping), 1)
745		else:
746		param_mapping = cli_element.option_identifier
747
748		return param_mapping if param_mapping is not None else param.name
749
750		def create_command(tool, model, **kwargs):
751		final_command = get_tool_executable_path(model, kwargs["default_executable_path"]) + '\n'
752		final_command += kwargs["add_to_command_line"] + '\n'
753		advanced_command_start = "#if $adv_opts.adv_opts_selector=='advanced':\n"
754		advanced_command_end = '#end if'
755		advanced_command = ''
756		parameter_hardcoder = kwargs["parameter_hardcoder"]
757
758		found_output_parameter = False
759		for param in extract_parameters(model):
760		if param.type is _OutFile:
761		found_output_parameter = True
762		command = ''
763		param_name = get_param_name(param)
764		param_cli_name = get_param_cli_name(param, model)
765		if param_name == param_cli_name:
766		# there was no mapping, so for the cli name we will use a '-' in the prefix
767		param_cli_name = '-' + param_name
768
769		if param.name in kwargs["blacklisted_parameters"]:
770		continue
771
772		hardcoded_value = parameter_hardcoder.get_hardcoded_value(param_name, model.name)
773		if hardcoded_value:
774		command += '%s %s\n' % (param_cli_name, hardcoded_value)
775		else:
776		# parameter is neither blacklisted nor hardcoded...
777		galaxy_parameter_name = get_galaxy_parameter_name(param)
778		repeat_galaxy_parameter_name = get_repeat_galaxy_parameter_name(param)
779
780		# logic for ITEMLISTs
781		if param.is_list:
782		if param.type is _InFile:
783		command += param_cli_name + "\n"
784		command += " #for token in $" + galaxy_parameter_name + ":\n"
785		command += " $token\n"
786		command += " #end for\n"
787		else:
788		command += "\n#if $" + repeat_galaxy_parameter_name + ":\n"
789		command += param_cli_name + "\n"
790		command += " #for token in $" + repeat_galaxy_parameter_name + ":\n"
791		command += " #if \" \" in str(token):\n"
792		command += " \"$token." + galaxy_parameter_name + "\"\n"
793		command += " #else\n"
794		command += " $token." + galaxy_parameter_name + "\n"
795		command += " #end if\n"
796		command += " #end for\n"
797		command += "#end if\n"
798		# logic for other ITEMs
799		else:
800		if param.advanced and param.type is not _OutFile:
801		actual_parameter = "$adv_opts.%s" % galaxy_parameter_name
802		else:
803		actual_parameter = "$%s" % galaxy_parameter_name
804		## if whitespace_validation has been set, we need to generate, for each parameter:
805		## #if str( $t ).split() != '':
806		## -t "$t"
807		## #end if
808		## TODO only useful for text fields, integers or floats
809		## not useful for choices, input fields ...
810
811		if not is_boolean_parameter(param) and type(param.restrictions) is _Choices :
812		command += "#if " + actual_parameter + ":\n"
813		command += ' %s\n' % param_cli_name
814		command += " #if \" \" in str(" + actual_parameter + "):\n"
815		command += " \"" + actual_parameter + "\"\n"
816		command += " #else\n"
817		command += " " + actual_parameter + "\n"
818		command += " #end if\n"
819		command += "#end if\n"
820		elif is_boolean_parameter(param):
821		command += "#if " + actual_parameter + ":\n"
822		command += ' %s\n' % param_cli_name
823		command += "#end if\n"
824		elif TYPE_TO_GALAXY_TYPE[param.type] == 'text':
825		command += "#if " + actual_parameter + ":\n"
826		command += " %s " % param_cli_name
827		command += " \"" + actual_parameter + "\"\n"
828		command += "#end if\n"
829		else:
830		command += "#if " + actual_parameter + ":\n"
831		command += ' %s ' % param_cli_name
832		command += actual_parameter + "\n"
833		command += "#end if\n"
834
835		if param.advanced and param.type is not _OutFile:
836		advanced_command += " %s" % command
837		else:
838		final_command += command
839
840		if advanced_command:
841		final_command += "%s%s%s\n" % (advanced_command_start, advanced_command, advanced_command_end)
842
843		if not found_output_parameter:
844		final_command += "> $param_stdout\n"
845
846		command_node = add_child_node(tool, "command")
847		command_node.text = final_command
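The non-list branches of `create_command` all emit the same Cheetah pattern: an `#if` guard around the option and its value. A minimal sketch of that pattern follows; the helper name `cheetah_optional_block` and its `quote` flag are hypothetical and not part of the converter.

```python
def cheetah_optional_block(cli_name, galaxy_variable, quote=False):
    """Build a Cheetah '#if' block that emits an option only when it is set.

    Hypothetical helper: it mirrors the strings assembled in create_command
    but is not part of the original module.
    """
    # text parameters are wrapped in double quotes, as in the 'text' branch
    value = '"%s"' % galaxy_variable if quote else galaxy_variable
    return "#if %s:\n  %s %s\n#end if\n" % (galaxy_variable, cli_name, value)


block = cheetah_optional_block("-threshold", "$param_threshold")
quoted = cheetah_optional_block("-label", "$param_label", quote=True)
```

Boolean flags follow the same shape but omit the value, since the option name alone is the flag.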
848
849
850		# creates the xml elements needed to import the needed macros files
851		# and to "expand" the macros
852		def expand_macros(tool, model, **kwargs):
853		macros_node = add_child_node(tool, "macros")
854		token_node = add_child_node(macros_node, "token")
855		token_node.attrib["name"] = "@EXECUTABLE@"
856		token_node.text = get_tool_executable_path(model, kwargs["default_executable_path"])
857
858		# add <import> nodes
859		for macro_file_name in kwargs["macros_file_names"]:
860		macro_file = open(macro_file_name)
861		import_node = add_child_node(macros_node, "import")
862		# do not add the path of the file, rather, just its basename
863		import_node.text = os.path.basename(macro_file.name)
864
865		# add <expand> nodes
866		for expand_macro in kwargs["macros_to_expand"]:
867		expand_node = add_child_node(tool, "expand")
868		expand_node.attrib["macro"] = expand_macro
869
870
871		def get_tool_executable_path(model, default_executable_path):
872		# rules to build the galaxy executable path:
873		# if executablePath is null, then use default_executable_path and store it in executablePath
874		# if executablePath is null and executableName is null, then the name of the tool will be used
875		# if executablePath is null and executableName is not null, then executableName will be used
876		# if executablePath is not null and executableName is null,
877		# then executablePath and the name of the tool will be used
878		# if executablePath is not null and executableName is not null, then both will be used
879
880		# first, check if the model has executablePath / executableName defined
881		executable_path = model.opt_attribs.get("executablePath", None)
882		executable_name = model.opt_attribs.get("executableName", None)
883
884		# check if we need to use the default_executable_path
885		if executable_path is None:
886		executable_path = default_executable_path
887
888		# fix the executablePath to make sure that there is a '/' at the end
889		if executable_path is not None:
890		executable_path = executable_path.strip()
891		if not executable_path.endswith('/'):
892		executable_path += '/'
893
894		# assume that we have all information present
895		command = str(executable_path) + str(executable_name)
896		if executable_path is None:
897		if executable_name is None:
898		command = model.name
899		else:
900		command = executable_name
901		else:
902		if executable_name is None:
903		command = executable_path + model.name
904		return command
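The rules listed in the comments of `get_tool_executable_path` can be condensed into two fallbacks: the path falls back to the default path, and the executable name falls back to the tool name. The sketch below is a standalone illustration under that reading; the function `build_executable_command` and its plain-argument signature are assumptions, since the original reads these values from `model.opt_attribs`.

```python
def build_executable_command(tool_name, executable_path=None,
                             executable_name=None, default_executable_path=None):
    """Hypothetical standalone version of the executable path rules."""
    # fall back to the default path when no executablePath is given
    if executable_path is None:
        executable_path = default_executable_path
    # fall back to the tool name when no executableName is given
    name = executable_name if executable_name is not None else tool_name
    if executable_path is None:
        return name
    # make sure the path ends with exactly one separator before the name
    executable_path = executable_path.strip()
    if not executable_path.endswith('/'):
        executable_path += '/'
    return executable_path + name


only_name = build_executable_command("FileInfo")
with_path = build_executable_command("FileInfo", executable_path="/opt/bin")
with_default = build_executable_command("FileInfo", executable_name="fi",
                                        default_executable_path="/usr/bin/")
```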
905
906
907		def get_galaxy_parameter_name(param):
908		return "param_%s" % get_param_name(param).replace(':', '_').replace('-', '_')
909
910
911		def get_input_with_same_restrictions(out_param, model, supported_file_formats):
912		for param in extract_parameters(model):
913		if param.type is _InFile:
914		if param.restrictions is not None:
915		in_param_formats = get_supported_file_types(param.restrictions.formats, supported_file_formats)
916		out_param_formats = get_supported_file_types(out_param.restrictions.formats, supported_file_formats)
917		if in_param_formats == out_param_formats:
918		return param
919
920
921		def create_inputs(tool, model, **kwargs):
922		inputs_node = SubElement(tool, "inputs")
923
924		# some suites (such as OpenMS) need some advanced options when handling inputs
925		expand_advanced_node = add_child_node(tool, "expand", OrderedDict([("macro", ADVANCED_OPTIONS_MACRO_NAME)]))
926		parameter_hardcoder = kwargs["parameter_hardcoder"]
927
928		# treat all non output-file parameters as inputs
929		for param in extract_parameters(model):
930		# no need to show hardcoded parameters
931		hardcoded_value = parameter_hardcoder.get_hardcoded_value(param.name, model.name)
932		if param.name in kwargs["blacklisted_parameters"] or hardcoded_value:
933		# let's not use an extra level of indentation and use NOP
934		continue
935		if param.type is not _OutFile:
936		if param.advanced:
937		if expand_advanced_node is not None:
938		parent_node = expand_advanced_node
939		else:
940		# something went wrong... we are handling an advanced parameter and the
941		# advanced input macro was not set... inform the user about it
942		info("The parameter %s has been set as advanced, but advanced_input_macro has "
943		"not been set." % param.name, 1)
944		# there is not much we can do, other than use the inputs_node as a parent node!
945		parent_node = inputs_node
946		else:
947		parent_node = inputs_node
948
949		# for lists we need a repeat tag
950		if param.is_list and param.type is not _InFile:
951		rep_node = add_child_node(parent_node, "repeat")
952		create_repeat_attribute_list(rep_node, param)
953		parent_node = rep_node
954
955		param_node = add_child_node(parent_node, "param")
956		create_param_attribute_list(param_node, param, kwargs["supported_file_formats"])
957
958		# advanced parameter selection should be at the end
959		# and only available if an advanced parameter exists
960		if expand_advanced_node is not None and len(expand_advanced_node) > 0:
961		inputs_node.append(expand_advanced_node)
962
963
964		def get_repeat_galaxy_parameter_name(param):
965		return "rep_" + get_galaxy_parameter_name(param)
966
967
968		def create_repeat_attribute_list(rep_node, param):
969		rep_node.attrib["name"] = get_repeat_galaxy_parameter_name(param)
970		if param.required:
971		rep_node.attrib["min"] = "1"
972		else:
973		rep_node.attrib["min"] = "0"
974		# for the ITEMLISTs which have LISTITEM children we only
975		# need one parameter as it is given as a string
976		if param.default is not None:
977		rep_node.attrib["max"] = "1"
978		rep_node.attrib["title"] = get_galaxy_parameter_name(param)
979
980
981		def create_param_attribute_list(param_node, param, supported_file_formats):
982		param_node.attrib["name"] = get_galaxy_parameter_name(param)
983
984		param_type = TYPE_TO_GALAXY_TYPE[param.type]
985		if param_type is None:
986		raise ModelError("Unrecognized parameter type %(type)s for parameter %(name)s"
987		% {"type": param.type, "name": param.name})
988
989		if param.is_list:
990		param_type = "text"
991
992		if is_selection_parameter(param):
993		param_type = "select"
994		if len(param.restrictions.choices) < 5:
995		param_node.attrib["display"] = "radio"
996
997		if is_boolean_parameter(param):
998		param_type = "boolean"
999
1000		if param.type is _InFile:
1001		# assume it's just text unless restrictions are provided
1002		param_format = "txt"
1003		if param.restrictions is not None:
1004		# join all formats of the file, take mapping from supported_file if available for an entry
1005		if type(param.restrictions) is _FileFormat:
1006		param_format = ','.join([get_supported_file_type(i, supported_file_formats) if
1007		get_supported_file_type(i, supported_file_formats)
1008		else i for i in param.restrictions.formats])
1009		else:
1010		raise InvalidModelException("Expected 'file type' restrictions for input file [%(name)s], "
1011		"but instead got [%(type)s]"
1012		% {"name": param.name, "type": type(param.restrictions)})
1013
1014		param_node.attrib["type"] = "data"
1015		param_node.attrib["format"] = param_format
1016		# in the case of multiple input set multiple flag
1017		if param.is_list:
1018		param_node.attrib["multiple"] = "true"
1019
1020		else:
1021		param_node.attrib["type"] = param_type
1022
1023		# check for parameters with restricted values (which will correspond to a "select" in galaxy)
1024		if param.restrictions is not None:
1025		# it could be either _Choices or _NumericRange, with special case for boolean types
1026		if param_type == "boolean":
1027		create_boolean_parameter(param_node, param)
1028		elif type(param.restrictions) is _Choices:
1029		# create as many <option> elements as restriction values
1030		for choice in param.restrictions.choices:
1031		option_node = add_child_node(param_node, "option", OrderedDict([("value", str(choice))]))
1032		option_node.text = str(choice)
1033
1034		# preselect the default value
1035		if param.default == choice:
1036		option_node.attrib["selected"] = "true"
1037
1038		elif type(param.restrictions) is _NumericRange:
1039		if param.type is not int and param.type is not float:
1040		raise InvalidModelException("Expected either 'int' or 'float' in the numeric range restriction for "
1041		"parameter [%(name)s], but instead got [%(type)s]" %
1042		{"name": param.name, "type": type(param.restrictions)})
1043		# extract the min and max values and add them as attributes
1044		# validate the provided min and max values
1045		if param.restrictions.n_min is not None:
1046		param_node.attrib["min"] = str(param.restrictions.n_min)
1047		if param.restrictions.n_max is not None:
1048		param_node.attrib["max"] = str(param.restrictions.n_max)
1049		elif type(param.restrictions) is _FileFormat:
1050		param_node.attrib["format"] = ','.join([get_supported_file_type(i, supported_file_formats) if
1051		get_supported_file_type(i, supported_file_formats)
1052		else i for i in param.restrictions.formats])
1053		else:
1054		raise InvalidModelException("Unrecognized restriction type [%(type)s] for parameter [%(name)s]"
1055		% {"type": type(param.restrictions), "name": param.name})
1056
1057		if param_type == "select" and param.default in param.restrictions.choices:
1058		param_node.attrib["optional"] = "False"
1059		else:
1060		param_node.attrib["optional"] = str(not param.required)
1061
1062		if param_type == "text":
1063		# add size attribute... this is the length of a textbox field in Galaxy (it could also be 15x2, for instance)
1064		param_node.attrib["size"] = "30"
1065		# add sanitizer nodes, this is needed for special character like "["
1066		# which are used for example by FeatureFinderMultiplex
1067		sanitizer_node = SubElement(param_node, "sanitizer")
1068
1069		valid_node = SubElement(sanitizer_node, "valid", OrderedDict([("initial", "string.printable")]))
1070		add_child_node(valid_node, "remove", OrderedDict([("value", '\'')]))
1071		add_child_node(valid_node, "remove", OrderedDict([("value", '"')]))
1072
1073		# check for default value
1074		if param.default is not None and param.default is not _Null:
1075		if type(param.default) is list:
1076		# we ASSUME that a list of parameters looks like:
1077		# $ tool -ignore He Ar Xe
1078		# meaning, that, for example, Helium, Argon and Xenon will be ignored
1079		param_node.attrib["value"] = ' '.join(map(str, param.default))
1080
1081		elif param_type != "boolean":
1082		param_node.attrib["value"] = str(param.default)
1083
1084		else:
1085		# simple boolean with a default
1086		if param.default is True:
1087		param_node.attrib["checked"] = "true"
1088		else:
1089		if param.type is int or param.type is float:
1090		# galaxy requires "value" to be included for int/float
1091		# since no default was included, we need to figure out one in a clever way... but let the user know
1092		# that we are "thinking" for him/her
1093		warning("Generating default value for parameter [%s]. "
1094		"Galaxy requires the attribute 'value' to be set for integer/floats. "
1095		"Edit the CTD file and provide a suitable default value." % param.name, 1)
1096		# check if there's a min/max and try to use them
1097		default_value = None
1098		if param.restrictions is not None:
1099		if type(param.restrictions) is _NumericRange:
1100		default_value = param.restrictions.n_min
1101		if default_value is None:
1102		default_value = param.restrictions.n_max
1103		if default_value is None:
1104		# no min/max provided... just use 0 and see what happens
1105		default_value = 0
1106		else:
1107		# should never be here, since we have validated this anyway...
1108		# this code is here just for documentation purposes
1109		# however, better safe than sorry!
1110		# (it could be that the code changes and then we have an ugly scenario)
1111		raise InvalidModelException("Expected a numeric range for parameter [%(name)s], "
1112		"but instead got [%(type)s]"
1113		% {"name": param.name, "type": type(param.restrictions)})
1114		else:
1115		# no restrictions and no default value provided...
1116		# make up something
1117		default_value = 0
1118		param_node.attrib["value"] = str(default_value)
1119
1120		label = "%s parameter" % param.name
1121		help_text = ""
1122
1123		if param.description is not None:
1124		label, help_text = generate_label_and_help(param.description)
1125
1126		param_node.attrib["label"] = label
1127		param_node.attrib["help"] = "(-%s)" % param.name + " " + help_text
1128
1129
1130		def generate_label_and_help(desc):
1131		label = ""
1132		help_text = ""
1133		# This tag is found in some descriptions
1134		if not isinstance(desc, basestring):
1135		desc = str(desc)
1136		desc = desc.encode("utf8").replace("#br#", " <br>")
1137		# Get rid of dots at the end
1138		if desc.endswith("."):
1139		desc = desc.rstrip(".")
1140		# Check if first word is a normal word and make it uppercase
1141		if str(desc).find(" ") > -1:
1142		first_word, rest = str(desc).split(" ", 1)
1143		if str(first_word).islower():
1144		# do not capitalize quotients of the form a/b (such as m/z)
1145		if "/" not in first_word:
1146		first_word = first_word.capitalize()
1147		desc = first_word + " " + rest
1148		label = desc.decode("utf8")
1149
1150		# Try to split the label if it is too long
1151		if len(desc) > 50:
1152		# find an example and put everything before in the label and the e.g. in the help
1153		if desc.find("e.g.") > 1 :
1154		label, help_text = desc.split("e.g.",1)
1155		help_text = "e.g." + help_text
1156		else:
1157		# find the end of the first sentence
1158		# look for ". " because some labels contain .file or something similar
1159		delimiter = ""
1160		if desc.find(". ") > 1 and desc.find("? ") > 1:
1161		if desc.find(". ") < desc.find("? "):
1162		delimiter = ". "
1163		else:
1164		delimiter = "? "
1165		elif desc.find(". ") > 1:
1166		delimiter = ". "
1167		elif desc.find("? ") > 1:
1168		delimiter = "? "
1169		if delimiter != "":
1170		label, help_text = desc.split(delimiter, 1)
1171
1172		# add the question mark back
1173		if delimiter == "? ":
1174		label += "? "
1175
1176		# remove all linebreaks
1177		label = label.rstrip().rstrip('<br>').rstrip()
1178		return label, help_text
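The splitting step in `generate_label_and_help` is easier to follow in isolation: long descriptions are cut at an `e.g.` if one exists, otherwise at the end of the first sentence. The following is a simplified sketch of that idea only. The name `split_label_and_help` and the `max_label_length` parameter are hypothetical, and the `#br#` handling and capitalization of the original are left out.

```python
def split_label_and_help(desc, max_label_length=50):
    """Split a description into a short label and a help text (simplified sketch)."""
    desc = desc.rstrip(".")
    if len(desc) <= max_label_length:
        return desc, ""
    # prefer splitting right before an example
    example_at = desc.find("e.g.")
    if example_at > 0:
        return desc[:example_at].rstrip(), desc[example_at:]
    # otherwise split at whichever sentence delimiter comes first
    candidates = [(desc.find(d), d) for d in (". ", "? ") if desc.find(d) > 0]
    if not candidates:
        return desc, ""
    position, delimiter = min(candidates)
    label = desc[:position]
    if delimiter == "? ":
        # keep the question mark as part of the label
        label += "?"
    return label, desc[position + len(delimiter):]


short = split_label_and_help("Input file.")
sentence = split_label_and_help(
    "Minimum intensity a peak must have to be considered. Peaks below it are discarded.")
example = split_label_and_help(
    "Tolerance used when matching fragment masses, e.g. 0.3 for low-resolution data")
```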
1179
1180
1181		def get_indented_text(text, indentation_level):
1182		return ("%(indentation)s%(text)s" %
1183		{"indentation": " " * (MESSAGE_INDENTATION_INCREMENT * indentation_level),
1184		"text": text})
1185
1186
1187		def warning(warning_text, indentation_level):
1188		sys.stdout.write(get_indented_text("WARNING: %s\n" % warning_text, indentation_level))
1189
1190
1191		def error(error_text, indentation_level):
1192		sys.stderr.write(get_indented_text("ERROR: %s\n" % error_text, indentation_level))
1193
1194
1195		def info(info_text, indentation_level):
1196		sys.stdout.write(get_indented_text("INFO: %s\n" % info_text, indentation_level))
1197
1198
1199		# determines if the given choices are boolean (basically, if the possible values are yes/no, true/false)
1200		def is_boolean_parameter(param):
1201		## detect boolean selects of OpenMS
1202		if is_selection_parameter(param):
1203		if len(param.restrictions.choices) == 2:
1204		# check that default value is false to make sure it is an actual flag
1205		if "false" in param.restrictions.choices and \
1206		"true" in param.restrictions.choices and \
1207		param.default == "false":
1208		return True
1209		else:
1210		return param.type is bool
1211
1212
1213		# determines if there are choices for the parameter
1214		def is_selection_parameter(param):
1215		return type(param.restrictions) is _Choices
1216
1217
1218		def get_lowercase_list(some_list):
1219		lowercase_list = map(str, some_list)
1220		lowercase_list = map(string.lower, lowercase_list)
1221		lowercase_list = map(strip, lowercase_list)
1222		return lowercase_list
1223
1224
1225		# creates a galaxy boolean parameter type
1226		# this method assumes that param has restrictions, and that only two restrictions are present
1227		# (either yes/no or true/false)
1228		def create_boolean_parameter(param_node, param):
1229		# first, determine the 'truevalue' and the 'falsevalue'
1230		"""TODO: true and false values can be way more than 'true' and 'false'
1231		but for that we need CTD support
1232		"""
1233		# by default, 'true' and 'false' are handled as flags, like the verbose flag (i.e., -v)
1234		true_value = "-%s" % get_param_name(param)
1235		false_value = ""
1236		choices = get_lowercase_list(param.restrictions.choices)
1237		if "yes" in choices:
1238		true_value = "yes"
1239		false_value = "no"
1240		param_node.attrib["truevalue"] = true_value
1241		param_node.attrib["falsevalue"] = false_value
1242
1243		# set the checked attribute
1244		if param.default is not None:
1245		checked_value = "false"
1246		default = strip(string.lower(param.default))
1247		if default == "yes" or default == "true":
1248		checked_value = "true"
1249		#attribute_list["checked"] = checked_value
1250		param_node.attrib["checked"] = checked_value
1251
1252
1253		def create_outputs(parent, model, **kwargs):
1254		outputs_node = add_child_node(parent, "outputs")
1255		parameter_hardcoder = kwargs["parameter_hardcoder"]
1256
1257		for param in extract_parameters(model):
1258
1259		# no need to show hardcoded parameters
1260		hardcoded_value = parameter_hardcoder.get_hardcoded_value(param.name, model.name)
1261		if param.name in kwargs["blacklisted_parameters"] or hardcoded_value:
1262		# let's not use an extra level of indentation and use NOP
1263		continue
1264		if param.type is _OutFile:
1265		create_output_node(outputs_node, param, model, kwargs["supported_file_formats"])
1266
1267		# If there are no outputs defined in the ctd the node will have no children
1268		# and the stdout will be used as output
1269		if len(outputs_node) == 0:
1270		add_child_node(outputs_node, "data",
1271		OrderedDict([("name", "param_stdout"), ("format", "txt"), ("label", "Output from stdout")]))
1272
1273
1274		def create_output_node(parent, param, model, supported_file_formats):
1275		data_node = add_child_node(parent, "data")
1276		data_node.attrib["name"] = get_galaxy_parameter_name(param)
1277
1278		data_format = "data"
1279		if param.restrictions is not None:
1280		if type(param.restrictions) is _FileFormat:
1281		# set the first data output node to the first file format
1282
1283		# check if there are formats that have not been registered yet...
1284		output = list()
1285		for format_name in param.restrictions.formats:
1286		if format_name not in supported_file_formats:
1287		output.append(str(format_name))
1288
1289		# warn only if there is something to complain about
1290		if output:
1291		warning("Parameter " + param.name + " has the following unsupported format(s):" + ','.join(output), 1)
1292		data_format = ','.join(output)
1293
1294		formats = get_supported_file_types(param.restrictions.formats, supported_file_formats)
1295		try:
1296		data_format = formats.pop()
1297		except KeyError:
1298		# there is not much we can do, other than catching the exception
1299		pass
1300		# if there is more than one output file format, try to take the format from the input parameter
1301		if formats:
1302		corresponding_input = get_input_with_same_restrictions(param, model, supported_file_formats)
1303		if corresponding_input is not None:
1304		data_format = "input"
1305		data_node.attrib["metadata_source"] = get_galaxy_parameter_name(corresponding_input)
1306		else:
1307		raise InvalidModelException("Unrecognized restriction type [%(type)s] "
1308		"for output [%(name)s]" % {"type": type(param.restrictions),
1309		"name": param.name})
1310		data_node.attrib["format"] = data_format
1311
1312		#TODO: find a smarter label ?
1313		#if param.description is not None:
1314		# data_node.setAttribute("label", param.description)
1315		return data_node
1316
1317
1318		# Get the supported file format for one given format
1319		def get_supported_file_type(format_name, supported_file_formats):
1320		if format_name in supported_file_formats.keys():
1321		return supported_file_formats.get(format_name, DataType(format_name, format_name)).galaxy_extension
1322		else:
1323		return None
1324
1325
1326		def get_supported_file_types(formats, supported_file_formats):
1327		return set([supported_file_formats.get(format_name, DataType(format_name, format_name)).galaxy_extension
1328		for format_name in formats if format_name in supported_file_formats.keys()])
1329
1330
1331		def create_change_format_node(parent, data_formats, input_ref):
1332		# <change_format>
1333		# <when input="secondary_structure" value="true" format="txt"/>
1334		# </change_format>
1335		change_format_node = add_child_node(parent, "change_format")
1336		for data_format in data_formats:
1337		add_child_node(change_format_node, "when",
1338		OrderedDict([("input", input_ref), ("value", data_format), ("format", data_format)]))
1339
1340
1341		# creates the <help> node from the manual text and the documentation URL of the model, if present
1342		def create_help(tool, model):
1343		manual = ''
1344		doc_url = None
1345		if 'manual' in model.opt_attribs.keys():
1346		manual += '%s\n\n' % model.opt_attribs["manual"]
1347		if 'docurl' in model.opt_attribs.keys():
1348		doc_url = model.opt_attribs["docurl"]
1349
1350		help_text = "No help available"
1351		if manual:
1352		help_text = manual
1353		if doc_url is not None:
1354		help_text = ("" if manual is None else manual) + "\nFor more information, visit %s" % doc_url
1355		help_node = add_child_node(tool, "help")
1356		# TODO: do we need CDATA Section here?
1357		help_node.text = help_text
1358
1359
1360		# since a model might contain several ParameterGroup elements,
1361		# we want to simply 'flatten' the parameters to generate the Galaxy wrapper
1362		def extract_parameters(model):
1363		parameters = []
1364		if len(model.parameters.parameters) > 0:
1365		# use this to put parameters that are to be processed
1366		# we know that CTDModel has one parent ParameterGroup
1367		pending = [model.parameters]
1368		while len(pending) > 0:
1369		# take one element from 'pending'
1370		parameter = pending.pop()
1371		if type(parameter) is not ParameterGroup:
1372		parameters.append(parameter)
1373		else:
1374		# append the first-level children of this ParameterGroup
1375		pending.extend(parameter.parameters.values())
1376		# return the reversed list of parameters (as it is now,
1377		# we have the last parameter in the CTD as first in the list)
1378		return reversed(parameters)
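The flattening in `extract_parameters` is an iterative depth-first traversal that uses a list as a stack, which visits leaves in reverse document order and therefore needs the final `reversed`. A minimal sketch with stand-in classes shows the effect; `Group` and `Leaf` are hypothetical substitutes for the `ParameterGroup` and parameter types from CTDopts.

```python
from collections import OrderedDict


class Leaf(object):
    """Stand-in for a single CTD parameter (hypothetical)."""
    def __init__(self, name):
        self.name = name


class Group(object):
    """Stand-in for a ParameterGroup holding named children (hypothetical)."""
    def __init__(self, name, children):
        self.name = name
        self.parameters = OrderedDict((child.name, child) for child in children)


def flatten(root):
    leaves = []
    pending = [root]
    while pending:
        # pop() takes the last element, so siblings are visited back to front
        node = pending.pop()
        if isinstance(node, Group):
            pending.extend(node.parameters.values())
        else:
            leaves.append(node)
    # undo the back-to-front visiting order
    return list(reversed(leaves))


root = Group("root", [Leaf("in"),
                      Group("algorithm", [Leaf("threshold"), Leaf("mode")]),
                      Leaf("out")])
names = [leaf.name for leaf in flatten(root)]
```

Returning a list rather than the `reversed` iterator is a choice made for this sketch so that the result can be traversed more than once.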
1379
1380
1381		# adds and returns a child node using the given name to the given parent node
1382		def add_child_node(parent_node, child_node_name, attributes=OrderedDict([])):
1383		child_node = SubElement(parent_node, child_node_name, attributes)
1384		return child_node
1385
1386
1387		if __name__ == "__main__":
1388		sys.exit(main())