First kind of working version.
Luis de la Garza
6 years ago
8 | 8 | - `pyyaml` |
9 | 9 | |
10 | 10 | ### Installing Dependencies |
11 | The easiest way is to install all required dependencies using `conda`, like so: | |
11 | We recommend using `conda` to manage all dependencies. If you're not sure what `conda` is, read the [conda documentation][using-conda]. | |
12 | ||
13 | The easiest way to get started with CTD conversion is to create a `conda` environment in which you'll install all dependencies. Environments in `conda` let you keep parallel, independent Python environments, thus avoiding conflicts between libraries. If you haven't installed `conda`, check [conda's installation guide][conda-install]. | |
14 | ||
15 | Once you've installed `conda`, create an environment named `ctd-converter` and list all dependencies, like so: | |
12 | 16 | |
13 | 17 | ```sh |
14 | $ conda install lxml pyyaml | |
15 | $ conda install -c workflowconversion ctdopts | |
18 | $ conda create --name ctd-converter --channel workflowconversion ctdopts lxml pyyaml libxml2=2.9.2 | |
16 | 19 | ``` |
17 | 20 | |
18 | Note that [CTDopts] is a python module available on the `workflowconversion` channel. Of course, you can just download [CTDopts] and make it available through your `PYTHONPATH` environment variable. To get more information about how to install python modules, visit: https://docs.python.org/2/install/. | |
21 | [CTDopts] is a python module available on the `workflowconversion` channel in the Anaconda cloud. Of course, you could just download [CTDopts] and make it available through your `PYTHONPATH` environment variable, if you're into that. To get more information about how to install python modules, visit: https://docs.python.org/2/install/. | |
19 | 22 | |
20 | ### Issues with `libxml2` and Schema Validation | |
21 | 23 | `lxml` depends on `libxml2`. When you install `lxml` you'll get the latest version of `libxml2` (2.9.4) by default. You would usually want the latest version; however, this version of `libxml2` has a bug in validating XML files against a schema. |
22 | 24 | |
23 | If you require validation of input CTDs against a schema (which we recommend), you will need to downgrade to the latest known version of `libxml2` that works, namely, 2.9.2. You can do this by executing the following command **after** you've installed all other dependencies: | |
25 | If you require validation of input CTDs against a schema (which we recommend), you will need to downgrade to the latest known version of `libxml2` that works, namely, 2.9.2. | |
26 | ||
27 | You will now need to *activate* the environment by executing the following command: | |
24 | 28 | |
25 | 29 | ```sh |
26 | $ conda install -y libxml2=2.9.2 | |
30 | $ source activate ctd-converter | |
27 | 31 | ``` |
28 | 32 | |
29 | The `-y` flag tells `conda` to perform the installation without confirmation. You will be warned that this command will downgrade some packages, which is fine, don't worry. | |
30 | ||
31 | 33 | ## How to install `CTDConverter` |
32 | `CTDConverter` is not a python module, rather, a series of scripts, so installing it is as easy as downloading the source code from https://github.com/genericworkflownodes/CTDConverter. | |
34 | `CTDConverter` is not a python module, rather, a series of scripts, so installing it is as easy as downloading the source code from https://github.com/genericworkflownodes/CTDConverter. Once you've installed all dependencies, downloaded `CTDConverter` and activated your `conda` environment, you're good to go. | |
33 | 35 | |
34 | 36 | ## Usage |
35 | 37 | The first thing that you need to tell `CTDConverter` is the output format of the converted wrappers. `CTDConverter` supports conversion of CTDs into Galaxy and CWL. Invoking it is as simple as follows: |
164 | 166 | |
165 | 167 | |
166 | 168 | [CTDopts]: https://github.com/genericworkflownodes/CTDopts |
167 | [CTDSchema]: https://github.com/WorkflowConversion/CTDSchema⏎ | |
169 | [CTDSchema]: https://github.com/WorkflowConversion/CTDSchema | |
170 | [conda-install]: https://conda.io/docs/install/quick.html | |
171 | [using-conda]: https://conda.io/docs/using/envs.html⏎ |
7 | 7 | from logger import info, error, warning |
8 | 8 | |
9 | 9 | from common.exceptions import ApplicationException |
10 | from CTDopts.CTDopts import CTDModel | |
10 | from CTDopts.CTDopts import CTDModel, ParameterGroup | |
11 | 11 | |
12 | 12 | |
13 | 13 | MESSAGE_INDENTATION_INCREMENT = 2 |
103 | 103 | except Exception, e: |
104 | 104 | error("Could not load validation schema %s. Reason: %s" % (xsd_location, str(e)), 0) |
105 | 105 | else: |
106 | info("Validation against a schema has not been enabled.", 0) | |
106 | warning("Validation against a schema has not been enabled.", 0) | |
107 | 107 | for input_ctd in input_ctds: |
108 | 108 | try: |
109 | 109 | if schema is not None: |
112 | 112 | # if multiple inputs are being converted, we need to generate a different output_file for each input |
113 | 113 | if is_converting_multiple_ctds: |
114 | 114 | output_file = os.path.join(output_file, |
115 | get_filename_without_suffix(input_ctd) + '.' + output_file_extension) | |
115 | get_filename_without_suffix(input_ctd) + "." + output_file_extension) | |
116 | info("Parsing %s" % input_ctd) | |
116 | 117 | parsed_ctds.append(ParsedCTD(CTDModel(from_file=input_ctd), input_ctd, output_file)) |
117 | 118 | except Exception, e: |
118 | 119 | error(str(e), 1) |
157 | 158 | # TODO: add verbosity, maybe? |
158 | 159 | program_version = "v%s" % version |
159 | 160 | program_build_date = str(last_updated) |
160 | program_version_message = '%%(prog)s %s (%s)' % (program_version, program_build_date) | |
161 | parser.add_argument("-v", "--version", action='version', version=program_version_message) | |
161 | program_version_message = "%%(prog)s %s (%s)" % (program_version, program_build_date) | |
162 | parser.add_argument("-v", "--version", action="version", version=program_version_message) | |
162 | 163 | |
163 | 164 | |
164 | 165 | def parse_hardcoded_parameters(hardcoded_parameters_file): |
191 | 192 | parameter_hardcoder.register_parameter(parameter_name, hardcoded_value) |
192 | 193 | |
193 | 194 | return parameter_hardcoder |
195 | ||
196 | ||
197 | def extract_tool_help_text(ctd_model): | |
198 | manual = "" | |
199 | doc_url = None | |
200 | if "manual" in ctd_model.opt_attribs.keys(): | |
201 | manual += "%s\n\n" % ctd_model.opt_attribs["manual"] | |
202 | if "docurl" in ctd_model.opt_attribs.keys(): | |
203 | doc_url = ctd_model.opt_attribs["docurl"] | |
204 | ||
205 | help_text = "No help available" | |
206 | if manual: | |
207 | help_text = manual | |
208 | if doc_url is not None: | |
209 | help_text = manual + "\nFor more information, visit %s" % doc_url | |
210 | ||
211 | return help_text | |
212 | ||
213 | ||
214 | def extract_tool_executable_path(model, default_executable_path): | |
215 | # rules to build the executable path: | |
216 | # if executablePath is null, then use default_executable_path | |
217 | # if executablePath is null and executableName is null, then the name of the tool will be used | |
218 | # if executablePath is null and executableName is not null, then executableName will be used | |
219 | # if executablePath is not null and executableName is null, | |
220 | # then executablePath and the name of the tool will be used | |
221 | # if executablePath is not null and executableName is not null, then both will be used | |
222 | ||
223 | # first, check if the model has executablePath / executableName defined | |
224 | executable_path = model.opt_attribs.get("executablePath", None) | |
225 | executable_name = model.opt_attribs.get("executableName", None) | |
226 | ||
227 | # check if we need to use the default_executable_path | |
228 | if executable_path is None: | |
229 | executable_path = default_executable_path | |
230 | ||
231 | # fix the executablePath to make sure that there is a '/' in the end | |
232 | if executable_path is not None: | |
233 | executable_path = executable_path.strip() | |
234 | if not executable_path.endswith("/"): | |
235 | executable_path += "/" | |
236 | ||
237 | # assume that we have all information present | |
238 | command = str(executable_path) + str(executable_name) | |
239 | if executable_path is None: | |
240 | if executable_name is None: | |
241 | command = model.name | |
242 | else: | |
243 | command = executable_name | |
244 | else: | |
245 | if executable_name is None: | |
246 | command = executable_path + model.name | |
247 | return command | |
248 | ||
249 | ||
250 | def extract_and_flatten_parameters(ctd_model): | |
251 | parameters = [] | |
252 | if len(ctd_model.parameters.parameters) > 0: | |
253 | # use this to put parameters that are to be processed | |
254 | # we know that CTDModel has one parent ParameterGroup | |
255 | pending = [ctd_model.parameters] | |
256 | while len(pending) > 0: | |
257 | # take one element from 'pending' | |
258 | parameter = pending.pop() | |
259 | if type(parameter) is not ParameterGroup: | |
260 | parameters.append(parameter) | |
261 | else: | |
262 | # append the first-level children of this ParameterGroup | |
263 | pending.extend(parameter.parameters.values()) | |
264 | # return the reversed list of parameters (as it is now, | |
265 | # we have the last parameter in the CTD as first in the list) | |
266 | return reversed(parameters) | |
267 | ||
268 | ||
269 | # some parameters are mapped to command line options, this method helps resolve those mappings, if any | |
270 | def resolve_param_mapping(param, ctd_model): | |
271 | # go through all mappings and find if the given param appears as a reference name in a mapping element | |
272 | param_mapping = None | |
273 | for cli_element in ctd_model.cli: | |
274 | for mapping_element in cli_element.mappings: | |
275 | if mapping_element.reference_name == param.name: | |
276 | if param_mapping is not None: | |
277 | warning("The parameter %s has more than one mapping in the <cli> section. " | |
278 | "The first found mapping, %s, will be used." % (param.name, param_mapping), 1) | |
279 | else: | |
280 | param_mapping = cli_element.option_identifier | |
281 | ||
282 | return param_mapping if param_mapping is not None else param.name | |
283 | ||
284 | ||
285 | def _extract_param_cli_name(param, ctd_model): | |
286 | # we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy) | |
287 | if type(param.parent) == ParameterGroup: | |
288 | if not hasattr(param.parent.parent, 'parent'): | |
289 | return resolve_param_mapping(param, ctd_model) | |
290 | elif not hasattr(param.parent.parent.parent, 'parent'): | |
291 | return resolve_param_mapping(param, ctd_model) | |
292 | else: | |
293 | if ctd_model.cli: | |
294 | warning("Using nested parameter sections (NODE elements) is not compatible with <cli>", 1) | |
295 | return extract_param_name(param.parent) + ":" + resolve_param_mapping(param, ctd_model) | |
296 | else: | |
297 | return resolve_param_mapping(param, ctd_model) | |
298 | ||
299 | ||
300 | def extract_param_name(param): | |
301 | # we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy) | |
302 | if type(param.parent) == ParameterGroup: | |
303 | if not hasattr(param.parent.parent, "parent"): | |
304 | return param.name | |
305 | elif not hasattr(param.parent.parent.parent, "parent"): | |
306 | return param.name | |
307 | else: | |
308 | return extract_param_name(param.parent) + ":" + param.name | |
309 | else: | |
310 | return param.name | |
311 | ||
312 | ||
313 | def extract_command_line_prefix(param, ctd_model): | |
314 | param_name = extract_param_name(param) | |
315 | param_cli_name = _extract_param_cli_name(param, ctd_model) | |
316 | if param_name == param_cli_name: | |
317 | # there was no mapping, so for the cli name we will use a '-' in the prefix | |
318 | param_cli_name = "-" + param_name | |
319 | return param_cli_name |
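The stack-based traversal in `extract_and_flatten_parameters` can be illustrated with a minimal, self-contained sketch. The `Parameter` and `ParameterGroup` classes below are simplified stand-ins, not the real CTDopts classes, and an ordered dict is assumed for the children so that the final reversal restores document order:

```python
from collections import OrderedDict


class Parameter(object):
    """Hypothetical stand-in for a CTDopts leaf parameter."""
    def __init__(self, name):
        self.name = name


class ParameterGroup(object):
    """Hypothetical stand-in for a CTDopts ParameterGroup."""
    def __init__(self, name, children):
        self.name = name
        # children keyed by name, kept in document order
        self.parameters = OrderedDict((child.name, child) for child in children)


def flatten(root_group):
    """Flatten nested groups with an explicit stack instead of recursion."""
    flat = []
    pending = [root_group]
    while pending:
        node = pending.pop()
        if isinstance(node, ParameterGroup):
            # children are pushed in order, so they are popped last-to-first
            pending.extend(node.parameters.values())
        else:
            flat.append(node)
    # leaves were collected last-to-first; reverse to restore document order
    return list(reversed(flat))


root = ParameterGroup("tool", [
    Parameter("in"),
    ParameterGroup("algorithm", [Parameter("threshold"), Parameter("mode")]),
    Parameter("out"),
])
names = [p.name for p in flatten(root)]
print(names)
```

The sketch returns a list, so the result can be traversed more than once; a bare `reversed(...)` call yields a one-shot iterator instead.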
5 | 5 | from argparse import ArgumentParser |
6 | 6 | from argparse import RawDescriptionHelpFormatter |
7 | 7 | from common.exceptions import ApplicationException, ModelError |
8 | ||
9 | 8 | |
10 | 9 | __all__ = [] |
11 | 10 | __version__ = 2.0 |
173 | 172 | # at this point we cannot parse the arguments, because each converter takes different arguments, meaning each |
174 | 173 | # converter will register its own parameters after we've registered the basic ones... we have to do it old school |
175 | 174 | if len(argv) < 2: |
176 | utils.error('Not enough arguments provided') | |
177 | print('\nUsage: $ python convert.py [TARGET] [ARGUMENTS]\n\n' + | |
178 | 'Where:\n' + | |
179 | ' target: one of \'cwl\' or \'galaxy\'\n\n' + | |
180 | 'Run again using the -h/--help option to print more detailed help.\n') | |
175 | utils.error("Not enough arguments provided") | |
176 | print("\nUsage: $ python convert.py [TARGET] [ARGUMENTS]\n\n" + | |
177 | "Where:\n" + | |
178 | " target: one of 'cwl' or 'galaxy'\n\n" + | |
179 | "Run again using the -h/--help option to print more detailed help.\n") | |
181 | 180 | return 1 |
182 | 181 | |
183 | 182 | # TODO: at some point this should look like real software engineering and use a map containing converter instances |
192 | 191 | print(program_license) |
193 | 192 | return 0 |
194 | 193 | else: |
195 | utils.error('Unrecognized target engine. Supported targets are \'cwl\' and \'galaxy\'.') | |
196 | return 1 | |
194 | utils.error("Unrecognized target engine. Supported targets are 'cwl' and 'galaxy'.") | |
195 | return 1 | |
196 | ||
197 | utils.info("Using %s converter" % target) | |
197 | 198 | |
198 | 199 | try: |
199 | 200 | # Setup argument parser |
216 | 217 | return converter.convert_models(args, parsed_ctds) |
217 | 218 | |
218 | 219 | except KeyboardInterrupt: |
219 | # handle keyboard interrupt | |
220 | print("Interrupted...") | |
220 | 221 | return 0 |
221 | 222 | |
222 | 223 | except ApplicationException, e: |
224 | traceback.print_exc() | |
223 | 225 | utils.error("CTDConverter could not complete the requested operation.", 0) |
224 | 226 | utils.error("Reason: " + e.msg, 0) |
225 | 227 | return 1 |
226 | 228 | |
227 | 229 | except ModelError, e: |
230 | traceback.print_exc() | |
228 | 231 | utils.error("There seems to be a problem with one of your input CTDs.", 0) |
229 | 232 | utils.error("Reason: " + e.msg, 0) |
230 | 233 | return 1 |
231 | 234 | |
232 | 235 | except Exception, e: |
233 | 236 | traceback.print_exc() |
237 | utils.error("CTDConverter could not complete the requested operation.", 0) | |
238 | utils.error("Reason: " + str(e), 0) | |
234 | 239 | return 2 |
235 | ||
236 | return 0 | |
237 | 240 | |
238 | 241 | |
239 | 242 | def validate_and_prepare_common_arguments(args): |
259 | 262 | for argument_name in input_arguments_to_check: |
260 | 263 | utils.validate_argument_is_valid_path(args, argument_name) |
261 | 264 | |
265 | # add the parameter hardcoder | |
266 | args.parameter_hardcoder = utils.parse_hardcoded_parameters(args.hardcoded_parameters) | |
267 | ||
262 | 268 | |
263 | 269 | if __name__ == "__main__": |
264 | 270 | sys.exit(main())⏎ |
0 | # Conversion of CTD Files to CWL | |
1 | ||
2 | ## How to use: Parameters in Detail | |
3 | The CWL converter has, for now, only the basic parameters described in the [top README file](../README.md). | |
4 |
0 | #!/usr/bin/env python | |
1 | # encoding: utf-8 | |
2 | ||
3 | # instead of using cwlgen, we decided to use PyYAML directly | |
4 | # we promptly found a problem with cwlgen, namely, it is not possible to construct something like: | |
5 | # some_parameter: | |
6 | # type: ['null', string] | |
7 | # which kind of sucks, because this seems to be the way to state that a parameter is truly optional and has no default | |
8 | # since cwlgen is just "fancy classes" around the yaml.dump() method, we implemented our own generation of yaml | |
9 | ||
10 | ||
11 | import yaml | |
12 | try: | |
13 | from yaml import CLoader as Loader | |
14 | except ImportError: | |
15 | from yaml import Loader | |
16 | ||
17 | from CTDopts.CTDopts import _InFile, _OutFile, ParameterGroup, _Choices, _NumericRange, _FileFormat, ModelError, _Null | |
18 | from common import utils, logger | |
19 | ||
20 | # all cwl-related properties are defined here | |
21 | ||
22 | CWL_SHEBANG = "#!/usr/bin/env cwl-runner" | |
23 | CURRENT_CWL_VERSION = 'v1.0' | |
24 | CWL_VERSION = 'cwlVersion' | |
25 | CLASS = 'class' | |
26 | BASE_COMMAND = 'baseCommand' | |
27 | INPUTS = 'inputs' | |
28 | ID = 'id' | |
29 | TYPE = 'type' | |
30 | INPUT_BINDING = 'inputBinding' | |
31 | OUTPUT_BINDING = 'outputBinding' | |
32 | PREFIX = 'prefix' | |
33 | OUTPUTS = 'outputs' | |
34 | VALUE_FROM = 'valueFrom' | |
35 | GLOB = 'glob' | |
36 | LABEL = 'label' | |
37 | DOC = 'doc' | |
38 | DEFAULT = 'default' | |
39 | ||
40 | # types | |
41 | TYPE_NULL = 'null' | |
42 | TYPE_BOOLEAN = 'boolean' | |
43 | TYPE_INT = 'int' | |
44 | TYPE_LONG = 'long' | |
45 | TYPE_FLOAT = 'float' | |
46 | TYPE_DOUBLE = 'double' | |
47 | TYPE_STRING = 'string' | |
48 | TYPE_FILE = 'File' | |
49 | TYPE_DIRECTORY = 'Directory' | |
50 | ||
51 | TYPE_TO_CWL_TYPE = {int: TYPE_INT, float: TYPE_DOUBLE, str: TYPE_STRING, bool: TYPE_BOOLEAN, _InFile: TYPE_FILE, | |
52 | _OutFile: TYPE_FILE, _Choices: TYPE_STRING} | |
53 | ||
54 | ||
55 | def add_specific_args(parser): | |
56 | # no specific arguments for CWL conversion, for now | |
57 | # however, this method has to be defined, otherwise ../convert.py won't work for CWL | |
58 | pass | |
59 | ||
60 | ||
61 | def get_preferred_file_extension(): | |
62 | return "cwl" | |
63 | ||
64 | ||
65 | def convert_models(args, parsed_ctds): | |
66 | # go through each ctd model and perform the conversion, easy as pie! | |
67 | for parsed_ctd in parsed_ctds: | |
68 | model = parsed_ctd.ctd_model | |
69 | origin_file = parsed_ctd.input_file | |
70 | output_file = parsed_ctd.suggested_output_file | |
71 | ||
72 | logger.info("Converting %s (source %s)" % (model.name, utils.get_filename(origin_file))) | |
73 | cwl_tool = convert_to_cwl(model, args) | |
74 | ||
75 | logger.info("Writing to %s" % utils.get_filename(output_file), 1) | |
76 | ||
77 | stream = open(output_file, 'w') | |
78 | stream.write(CWL_SHEBANG + '\n\n') | |
79 | yaml.dump(cwl_tool, stream, default_flow_style=True) | |
80 | stream.close() | |
81 | ||
82 | return 0 | |
83 | ||
84 | ||
85 | # returns a dictionary | |
86 | def convert_to_cwl(ctd_model, args): | |
87 | # create cwl_tool object with the basic information | |
88 | base_command = utils.extract_tool_executable_path(ctd_model, args.default_executable_path) | |
89 | ||
90 | # add basic properties | |
91 | cwl_tool = {} | |
92 | cwl_tool[CWL_VERSION] = CURRENT_CWL_VERSION | |
93 | cwl_tool[CLASS] = 'CommandLineTool' | |
94 | cwl_tool[LABEL] = ctd_model.opt_attribs["description"] | |
95 | cwl_tool[DOC] = utils.extract_tool_help_text(ctd_model) | |
96 | cwl_tool[BASE_COMMAND] = base_command | |
97 | ||
98 | # TODO: test with optional output files | |
99 | ||
100 | # add inputs/outputs | |
101 | for param in utils.extract_and_flatten_parameters(ctd_model): | |
102 | param_name = utils.extract_param_name(param) | |
103 | cwl_fixed_param_name = fix_param_name(param_name) | |
104 | hardcoded_value = args.parameter_hardcoder.get_hardcoded_value(param_name, ctd_model.name) | |
105 | param_default = str(param.default) if param.default is not _Null and param.default is not None else None | |
106 | ||
107 | if param.type is _OutFile: | |
108 | create_lists_if_missing(cwl_tool, [INPUTS, OUTPUTS]) | |
109 | # we know the only outputs are of type _OutFile | |
110 | # we need an input of type string that will contain the name of the output file | |
111 | input_binding = {} | |
112 | input_binding[PREFIX] = utils.extract_command_line_prefix(param, ctd_model) | |
113 | if hardcoded_value is not None: | |
114 | input_binding[VALUE_FROM] = hardcoded_value | |
115 | ||
116 | label = "Filename for %s output file" % param_name | |
117 | input_name_for_output_filename = get_input_name_for_output_filename(param) | |
118 | input_param = {} | |
119 | input_param[ID] = input_name_for_output_filename | |
120 | input_param[INPUT_BINDING] = input_binding | |
121 | input_param[DOC] = label | |
122 | input_param[LABEL] = label | |
123 | if param_default is not None: | |
124 | input_param[DEFAULT] = param_default | |
125 | input_param[TYPE] = generate_cwl_param_type(param, TYPE_STRING) | |
126 | ||
127 | output_binding = {} | |
128 | output_binding[GLOB] = "$(inputs.%s)" % input_name_for_output_filename | |
129 | ||
130 | output_param = {} | |
131 | output_param[ID] = cwl_fixed_param_name | |
132 | output_param[OUTPUT_BINDING] = output_binding | |
133 | output_param[DOC] = param.description | |
134 | output_param[LABEL] = param.description | |
135 | output_param[TYPE] = generate_cwl_param_type(param) | |
136 | ||
137 | cwl_tool[INPUTS].append(input_param) | |
138 | cwl_tool[OUTPUTS].append(output_param) | |
139 | ||
140 | else: | |
141 | create_lists_if_missing(cwl_tool, [INPUTS]) | |
142 | # we know that anything that is not an _OutFile is an input | |
143 | input_binding = {} | |
144 | input_binding[PREFIX] = utils.extract_command_line_prefix(param, ctd_model) | |
145 | if hardcoded_value is not None: | |
146 | input_binding[VALUE_FROM] = hardcoded_value | |
147 | ||
148 | input_param = {} | |
149 | input_param[ID] = cwl_fixed_param_name | |
150 | input_param[DOC] = param.description | |
151 | input_param[LABEL] = param.description | |
152 | if param_default is not None: | |
153 | input_param[DEFAULT] = param_default | |
154 | input_param[INPUT_BINDING] = input_binding | |
155 | input_param[TYPE] = generate_cwl_param_type(param) | |
156 | ||
157 | cwl_tool[INPUTS].append(input_param) | |
158 | ||
159 | return cwl_tool | |
160 | ||
161 | ||
162 | def create_lists_if_missing(cwl_tool, keys): | |
163 | for key in keys: | |
164 | if key not in cwl_tool: | |
165 | cwl_tool[key] = [] | |
166 | ||
167 | ||
168 | def get_input_name_for_output_filename(param): | |
169 | assert param.type is _OutFile, "Only output files can get a generated filename input parameter." | |
170 | return fix_param_name(utils.extract_param_name(param)) + "_filename" | |
171 | ||
172 | ||
173 | def fix_param_name(param_name): | |
174 | # IMPORTANT: there seems to be a problem in CWL if the prefix and the parameter name are the same, so we need to | |
175 | # prepend something to the parameter name that will be registered in CWL, also, using colons in parameter | |
176 | # names seems to bring all sorts of problems for cwl-runner | |
177 | return 'param_' + param_name.replace(":", "_") | |
178 | ||
179 | ||
180 | # in order to provide "true" optional params, the parameter type should be something like ['null', <CWLType>], | |
181 | # for instance ['null', int] | |
182 | def generate_cwl_param_type(param, forced_type=None): | |
183 | cwl_type = TYPE_TO_CWL_TYPE[param.type] if forced_type is None else forced_type | |
184 | return cwl_type if param.required else [TYPE_NULL, cwl_type] |
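The comment above `generate_cwl_param_type` describes optional parameters as a two-element type such as `['null', int]`. A minimal sketch of that idea follows, assuming the type should be emitted as a real list so that a serializer writes a sequence rather than a quoted string. `FakeParam` and the trimmed type table are hypothetical stand-ins, and `json` is used only to keep the example free of third-party dependencies:

```python
import json

TYPE_NULL = "null"
# trimmed, illustrative version of the TYPE_TO_CWL_TYPE table
TYPE_TO_CWL_TYPE = {int: "int", float: "double", str: "string", bool: "boolean"}


class FakeParam(object):
    """Hypothetical stand-in exposing only the attributes used below."""
    def __init__(self, type_, required):
        self.type = type_
        self.required = required


def cwl_param_type(param, forced_type=None):
    cwl_type = TYPE_TO_CWL_TYPE[param.type] if forced_type is None else forced_type
    # required parameters keep the bare type; optional ones become a union with 'null'
    return cwl_type if param.required else [TYPE_NULL, cwl_type]


required_type = cwl_param_type(FakeParam(int, True))
optional_type = cwl_param_type(FakeParam(str, False))
forced_type = cwl_param_type(FakeParam(float, False), "string")

# a list serializes as a sequence, which is what an optional CWL type needs
print(json.dumps({"type": optional_type}))
```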
5 | 5 | * Taken values: The destination of the file. |
6 | 6 | |
7 | 7 | $ python convert.py galaxy -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -t /data/generated-galaxy-stubs/tool_conf.xml |
8 | ||
8 | 9 | |
9 | 10 | ## Adding Parameters to the Command-line |
10 | 11 | * Purpose: Galaxy *ToolConfig* files include a `<command>` element in which the command line to invoke the tool can be given. Sometimes you need to invoke your tools in a certain way (e.g., passing certain parameters). For instance, some tools can be invoked in a verbose or quiet way, or even in a headless way (i.e., without a GUI). |
0 | 0 | #!/usr/bin/env python |
1 | 1 | # encoding: utf-8 |
2 | ||
3 | """ | |
4 | @author: delagarza | |
5 | """ | |
6 | ||
7 | 2 | import os |
8 | 3 | import string |
9 | 4 | |
14 | 9 | |
15 | 10 | from common import utils, logger |
16 | 11 | from common.exceptions import ApplicationException, InvalidModelException |
17 | from common.utils import ParsedCTD | |
18 | 12 | |
19 | 13 | from CTDopts.CTDopts import _InFile, _OutFile, ParameterGroup, _Choices, _NumericRange, _FileFormat, ModelError, _Null |
20 | 14 | |
69 | 63 | "Run with '-h' or '--help' to see a brief example on the format of this file.") |
70 | 64 | parser.add_argument("-m", "--macros", dest="macros_files", default=[], nargs="*", |
71 | 65 | action="append", required=None, help="Import the additional given file(s) as macros. " |
72 | "The macros stdio, requirements and advanced_options are required. Please see " | |
73 | "macros.xml for an example of a valid macros file. Al defined macros will be imported.") | |
74 | ||
75 | ||
76 | def convert_models(args, parsed_ctds): # IGNORE:C0111 | |
66 | "The macros stdio, requirements and advanced_options are " | |
67 | "required. Please see galaxy/macros.xml for an example of a " | |
68 | "valid macros file. All defined macros will be imported.") | |
69 | ||
70 | ||
71 | def convert_models(args, parsed_ctds): | |
77 | 72 | # validate and prepare the passed arguments |
78 | 73 | validate_and_prepare_args(args) |
79 | 74 | |
82 | 77 | |
83 | 78 | # parse the given supported file-formats file |
84 | 79 | supported_file_formats = parse_file_formats(args.formats_file) |
85 | ||
86 | # parse the hardcoded parameters file¬ | |
87 | parameter_hardcoder = utils.parse_hardcoded_parameters(args.hardcoded_parameters) | |
88 | 80 | |
89 | 81 | # parse the skip/required tools files |
90 | 82 | skip_tools = parse_tools_list_file(args.skip_tools_file) |
91 | 83 | required_tools = parse_tools_list_file(args.required_tools_file) |
92 | 84 | |
93 | 85 | _convert_internal(parsed_ctds, |
94 | supported_file_formats=supported_file_formats, | |
95 | default_executable_path=args.default_executable_path, | |
96 | add_to_command_line=args.add_to_command_line, | |
97 | blacklisted_parameters=args.blacklisted_parameters, | |
98 | required_tools=required_tools, | |
99 | skip_tools=skip_tools, | |
100 | macros_file_names=args.macros_files, | |
101 | macros_to_expand=macros_to_expand, | |
102 | parameter_hardcoder=parameter_hardcoder) | |
86 | supported_file_formats=supported_file_formats, | |
87 | default_executable_path=args.default_executable_path, | |
88 | add_to_command_line=args.add_to_command_line, | |
89 | blacklisted_parameters=args.blacklisted_parameters, | |
90 | required_tools=required_tools, | |
91 | skip_tools=skip_tools, | |
92 | macros_file_names=args.macros_files, | |
93 | macros_to_expand=macros_to_expand, | |
94 | parameter_hardcoder=args.parameter_hardcoder) | |
103 | 95 | |
104 | 96 | # generation of galaxy stubs is ready... now, let's see if we need to generate a tool_conf.xml |
105 | 97 | if args.tool_conf_destination is not None: |
248 | 240 | logger.info("Tool %s is not required, skipping it" % model.name, 0) |
249 | 241 | continue |
250 | 242 | else: |
251 | logger.info("Converting from %s " % origin_file, 0) | |
243 | logger.info("Converting %s (source %s)" % (model.name, utils.get_filename(origin_file)), 0) | |
252 | 244 | tool = create_tool(model) |
253 | 245 | write_header(tool, model) |
254 | 246 | create_description(tool, model) |
260 | 252 | |
261 | 253 | # wrap our tool element into a tree to be able to serialize it |
262 | 254 | tree = ElementTree(tool) |
255 | logger.info("Writing to %s" % utils.get_filename(output_file), 1) | |
263 | 256 | tree.write(open(output_file, 'w'), encoding="UTF-8", xml_declaration=True, pretty_print=True) |
264 | 257 | |
265 | 258 | |
335 | 328 | description.text = model.opt_attribs["description"] |
336 | 329 | |
337 | 330 | |
338 | def get_param_cli_name(param, model): | |
339 | # we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy) | |
340 | if type(param.parent) == ParameterGroup: | |
341 | if not hasattr(param.parent.parent, 'parent'): | |
342 | return resolve_param_mapping(param, model) | |
343 | elif not hasattr(param.parent.parent.parent, 'parent'): | |
344 | return resolve_param_mapping(param, model) | |
345 | else: | |
346 | if model.cli: | |
347 | logger.warning("Using nested parameter sections (NODE elements) is not compatible with <cli>", 1) | |
348 | return get_param_name(param.parent) + ":" + resolve_param_mapping(param, model) | |
349 | else: | |
350 | return resolve_param_mapping(param, model) | |
351 | ||
352 | ||
353 | def get_param_name(param): | |
354 | # we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy) | |
355 | if type(param.parent) == ParameterGroup: | |
356 | if not hasattr(param.parent.parent, 'parent'): | |
357 | return param.name | |
358 | elif not hasattr(param.parent.parent.parent, 'parent'): | |
359 | return param.name | |
360 | else: | |
361 | return get_param_name(param.parent) + ":" + param.name | |
362 | else: | |
363 | return param.name | |
364 | ||
365 | ||
366 | # some parameters are mapped to command line options, this method helps resolve those mappings, if any | |
367 | def resolve_param_mapping(param, model): | |
368 | # go through all mappings and find if the given param appears as a reference name in a mapping element | |
369 | param_mapping = None | |
370 | for cli_element in model.cli: | |
371 | for mapping_element in cli_element.mappings: | |
372 | if mapping_element.reference_name == param.name: | |
373 | if param_mapping is not None: | |
374 | logger.warning("The parameter %s has more than one mapping in the <cli> section. " | |
375 | "The first found mapping, %s, will be used." % (param.name, param_mapping), 1) | |
376 | else: | |
377 | param_mapping = cli_element.option_identifier | |
378 | ||
379 | return param_mapping if param_mapping is not None else param.name | |
380 | ||
381 | ||
382 | 331 | def create_command(tool, model, **kwargs): |
383 | final_command = get_tool_executable_path(model, kwargs["default_executable_path"]) + '\n' | |
332 | final_command = utils.extract_tool_executable_path(model, kwargs["default_executable_path"]) + '\n' | |
384 | 333 | final_command += kwargs["add_to_command_line"] + '\n' |
385 | 334 | advanced_command_start = "#if $adv_opts.adv_opts_selector=='advanced':\n" |
386 | advanced_command_end = '#end if' | |
387 | advanced_command = '' | |
335 | advanced_command_end = "#end if" | |
336 | advanced_command = "" | |
388 | 337 | parameter_hardcoder = kwargs["parameter_hardcoder"] |
389 | 338 | |
390 | 339 | found_output_parameter = False |
391 | for param in extract_parameters(model): | |
340 | for param in utils.extract_and_flatten_parameters(model): | |
392 | 341 | if param.type is _OutFile: |
393 | 342 | found_output_parameter = True |
394 | command = '' | |
395 | param_name = get_param_name(param) | |
396 | param_cli_name = get_param_cli_name(param, model) | |
397 | if param_name == param_cli_name: | |
398 | # there was no mapping, so for the cli name we will use a '-' in the prefix | |
399 | param_cli_name = '-' + param_name | |
343 | command = "" | |
344 | param_name = utils.extract_param_name(param) | |
345 | command_line_prefix = utils.extract_command_line_prefix(param, model) | |
400 | 346 | |
401 | 347 | if param.name in kwargs["blacklisted_parameters"]: |
402 | 348 | continue |
403 | 349 | |
404 | 350 | hardcoded_value = parameter_hardcoder.get_hardcoded_value(param_name, model.name) |
405 | 351 | if hardcoded_value: |
406 | command += '%s %s\n' % (param_cli_name, hardcoded_value) | |
352 | command += "%s %s\n" % (command_line_prefix, hardcoded_value) | |
407 | 353 | else: |
408 | 354 | # parameter is neither blacklisted nor hardcoded... |
409 | 355 | galaxy_parameter_name = get_galaxy_parameter_name(param) |
412 | 358 | # logic for ITEMLISTs |
413 | 359 | if param.is_list: |
414 | 360 | if param.type is _InFile: |
415 | command += param_cli_name + "\n" | |
361 | command += command_line_prefix + "\n" | |
416 | 362 | command += " #for token in $" + galaxy_parameter_name + ":\n" |
417 | 363 | command += " $token\n" |
418 | 364 | command += " #end for\n" |
419 | 365 | else: |
420 | 366 | command += "\n#if $" + repeat_galaxy_parameter_name + ":\n" |
421 | command += param_cli_name + "\n" | |
367 | command += command_line_prefix + "\n" | |
422 | 368 | command += " #for token in $" + repeat_galaxy_parameter_name + ":\n" |
423 | 369 | command += " #if \" \" in str(token):\n" |
424 | 370 | command += " \"$token." + galaxy_parameter_name + "\"\n" |
438 | 384 | |
439 | 385 | if not is_boolean_parameter(param) and type(param.restrictions) is _Choices : |
440 | 386 | command += "#if " + actual_parameter + ":\n" |
441 | command += ' %s\n' % param_cli_name | |
387 | command += " %s\n" % command_line_prefix | |
442 | 388 | command += " #if \" \" in str(" + actual_parameter + "):\n" |
443 | 389 | command += " \"" + actual_parameter + "\"\n" |
444 | 390 | command += " #else\n" |
447 | 393 | command += "#end if\n" |
448 | 394 | elif is_boolean_parameter(param): |
449 | 395 | command += "#if " + actual_parameter + ":\n" |
450 | command += ' %s\n' % param_cli_name | |
396 | command += " %s\n" % command_line_prefix | |
451 | 397 | command += "#end if\n" |
452 | 398 | elif TYPE_TO_GALAXY_TYPE[param.type] == 'text': |
453 | 399 | command += "#if " + actual_parameter + ":\n" |
454 | command += " %s " % param_cli_name | |
400 | command += " %s " % command_line_prefix | |
455 | 401 | command += " \"" + actual_parameter + "\"\n" |
456 | 402 | command += "#end if\n" |
457 | 403 | else: |
458 | 404 | command += "#if " + actual_parameter + ":\n" |
459 | command += ' %s ' % param_cli_name | |
405 | command += " %s " % command_line_prefix | |
460 | 406 | command += actual_parameter + "\n" |
461 | 407 | command += "#end if\n" |
462 | 408 | |
481 | 427 | macros_node = add_child_node(tool, "macros") |
482 | 428 | token_node = add_child_node(macros_node, "token") |
483 | 429 | token_node.attrib["name"] = "@EXECUTABLE@" |
484 | token_node.text = get_tool_executable_path(model, kwargs["default_executable_path"]) | |
430 | token_node.text = utils.extract_tool_executable_path(model, kwargs["default_executable_path"]) | |
485 | 431 | |
486 | 432 | # add <import> nodes |
487 | 433 | for macro_file_name in kwargs["macros_file_names"]: |
496 | 442 | expand_node.attrib["macro"] = expand_macro |
497 | 443 | |
498 | 444 | |
499 | def get_tool_executable_path(model, default_executable_path): | |
500 | # rules to build the galaxy executable path: | |
501 | # if executablePath is null, then use default_executable_path and store it in executablePath | |
502 | # if executablePath is null and executableName is null, then the name of the tool will be used | |
503 | # if executablePath is null and executableName is not null, then executableName will be used | |
504 | # if executablePath is not null and executableName is null, | |
505 | # then executablePath and the name of the tool will be used | |
506 | # if executablePath is not null and executableName is not null, then both will be used | |
507 | ||
508 | # first, check if the model has executablePath / executableName defined | |
509 | executable_path = model.opt_attribs.get("executablePath", None) | |
510 | executable_name = model.opt_attribs.get("executableName", None) | |
511 | ||
512 | # check if we need to use the default_executable_path | |
513 | if executable_path is None: | |
514 | executable_path = default_executable_path | |
515 | ||
516 | # fix the executablePath to make sure that there is a '/' in the end | |
517 | if executable_path is not None: | |
518 | executable_path = executable_path.strip() | |
519 | if not executable_path.endswith('/'): | |
520 | executable_path += '/' | |
521 | ||
522 | # assume that we have all information present | |
523 | command = str(executable_path) + str(executable_name) | |
524 | if executable_path is None: | |
525 | if executable_name is None: | |
526 | command = model.name | |
527 | else: | |
528 | command = executable_name | |
529 | else: | |
530 | if executable_name is None: | |
531 | command = executable_path + model.name | |
532 | return command | |
533 | ||
534 | ||
535 | 445 | def get_galaxy_parameter_name(param): |
536 | return "param_%s" % get_param_name(param).replace(':', '_').replace('-', '_') | |
446 | return "param_%s" % utils.extract_param_name(param).replace(":", "_").replace("-", "_") | |
537 | 447 | |
538 | 448 | |
539 | 449 | def get_input_with_same_restrictions(out_param, model, supported_file_formats): |
540 | for param in extract_parameters(model): | |
450 | for param in utils.extract_and_flatten_parameters(model): | |
541 | 451 | if param.type is _InFile: |
542 | 452 | if param.restrictions is not None: |
543 | 453 | in_param_formats = get_supported_file_types(param.restrictions.formats, supported_file_formats) |
554 | 464 | parameter_hardcoder = kwargs["parameter_hardcoder"] |
555 | 465 | |
556 | 466 | # treat all non output-file parameters as inputs |
557 | for param in extract_parameters(model): | |
467 | for param in utils.extract_and_flatten_parameters(model): | |
558 | 468 | # no need to show hardcoded parameters |
559 | 469 | hardcoded_value = parameter_hardcoder.get_hardcoded_value(param.name, model.name) |
560 | 470 | if param.name in kwargs["blacklisted_parameters"] or hardcoded_value: |
568 | 478 | # something went wrong... we are handling an advanced parameter and the |
569 | 479 | # advanced input macro was not set... inform the user about it |
570 | 480 | logger.info("The parameter %s has been set as advanced, but advanced_input_macro has " |
571 | "not been set." % param.name, 1) | |
481 | "not been set." % param.name, 1) | |
572 | 482 | # there is not much we can do, other than use the inputs_node as a parent node! |
573 | 483 | parent_node = inputs_node |
574 | 484 | else: |
631 | 541 | if param.restrictions is not None: |
632 | 542 | # join all formats of the file, take mapping from supported_file if available for an entry |
633 | 543 | if type(param.restrictions) is _FileFormat: |
634 | param_format = ','.join([get_supported_file_type(i, supported_file_formats) if | |
635 | get_supported_file_type(i, supported_file_formats) | |
636 | else i for i in param.restrictions.formats]) | |
544 | param_format = ",".join([get_supported_file_type(i, supported_file_formats) if | |
545 | get_supported_file_type(i, supported_file_formats) | |
546 | else i for i in param.restrictions.formats]) | |
637 | 547 | else: |
638 | 548 | raise InvalidModelException("Expected 'file type' restrictions for input file [%(name)s], " |
639 | 549 | "but instead got [%(type)s]" |
719 | 629 | # since no default was included, we need to figure out one in a clever way... but let the user know |
720 | 630 | # that we are "thinking" for him/her |
721 | 631 | logger.warning("Generating default value for parameter [%s]. " |
722 | "Galaxy requires the attribute 'value' to be set for integer/floats. " | |
723 | "Edit the CTD file and provide a suitable default value." % param.name, 1) | |
632 | "Galaxy requires the attribute 'value' to be set for integer/floats. " | |
633 | "Edit the CTD file and provide a suitable default value." % param.name, 1) | |
724 | 634 | # check if there's a min/max and try to use them |
725 | 635 | default_value = None |
726 | 636 | if param.restrictions is not None: |
840 | 750 | but for that we need CTD support |
841 | 751 | """ |
842 | 752 | # by default, 'true' and 'false' are handled as flags, like the verbose flag (i.e., -v) |
843 | true_value = "-%s" % get_param_name(param) | |
753 | true_value = "-%s" % utils.extract_param_name(param) | |
844 | 754 | false_value = "" |
845 | 755 | choices = get_lowercase_list(param.restrictions.choices) |
846 | 756 | if "yes" in choices: |
862 | 772 | outputs_node = add_child_node(parent, "outputs") |
863 | 773 | parameter_hardcoder = kwargs["parameter_hardcoder"] |
864 | 774 | |
865 | for param in extract_parameters(model): | |
775 | for param in utils.extract_and_flatten_parameters(model): | |
866 | 776 | |
867 | 777 | # no need to show hardcoded parameters |
868 | 778 | hardcoded_value = parameter_hardcoder.get_hardcoded_value(param.name, model.name) |
947 | 857 | |
948 | 858 | # Shows basic information about the file, such as data ranges and file type. |
949 | 859 | def create_help(tool, model): |
950 | manual = '' | |
951 | doc_url = None | |
952 | if 'manual' in model.opt_attribs.keys(): | |
953 | manual += '%s\n\n' % model.opt_attribs["manual"] | |
954 | if 'docurl' in model.opt_attribs.keys(): | |
955 | doc_url = model.opt_attribs["docurl"] | |
956 | ||
957 | help_text = "No help available" | |
958 | if manual is not None: | |
959 | help_text = manual | |
960 | if doc_url is not None: | |
961 | help_text = ("" if manual is None else manual) + "\nFor more information, visit %s" % doc_url | |
962 | 860 | help_node = add_child_node(tool, "help") |
963 | 861 | # TODO: do we need CDATA Section here? |
964 | help_node.text = help_text | |
965 | ||
966 | ||
967 | # since a model might contain several ParameterGroup elements, | |
968 | # we want to simply 'flatten' the parameters to generate the Galaxy wrapper | |
969 | def extract_parameters(model): | |
970 | parameters = [] | |
971 | if len(model.parameters.parameters) > 0: | |
972 | # use this to put parameters that are to be processed | |
973 | # we know that CTDModel has one parent ParameterGroup | |
974 | pending = [model.parameters] | |
975 | while len(pending) > 0: | |
976 | # take one element from 'pending' | |
977 | parameter = pending.pop() | |
978 | if type(parameter) is not ParameterGroup: | |
979 | parameters.append(parameter) | |
980 | else: | |
981 | # append the first-level children of this ParameterGroup | |
982 | pending.extend(parameter.parameters.values()) | |
983 | # returned the reversed list of parameters (as it is now, | |
984 | # we have the last parameter in the CTD as first in the list) | |
985 | return reversed(parameters) | |
862 | help_node.text = utils.extract_tool_help_text(model) | |
986 | 863 | |
987 | 864 | |
988 | 865 | # adds and returns a child node using the given name to the given parent node |