Package list ctdconverter / dc098b9
Each supported format gets a separate folder Luis de la Garza 4 years ago
12 changed file(s) with 1799 addition(s) and 1786 deletion(s).
0 # CTD2Galaxy
0 # CTDConverter
11
2
3 Given one or more CTD files, `CTD2Galaxy` generates the needed Galaxy wrappers to include them in a Galaxy instance.
2 Given one or more CTD files, `CTDConverter` generates the needed wrappers to include them in workflow engines, such as Galaxy and CWL.
43
54 ## Dependencies
65
7 `CTD2Galaxy` has the following python dependencies:
6 `CTDConverter` relies on [CTDopts]. The dependencies of each of the converters are as follows:
87
9 1. `lxml`.
10 1. [CTDopts]
8 ### Galaxy Converter
119
10 - Generation of Galaxy ToolConfig files relies on `lxml` to generate nice-looking XML files.
11
12 ## Installing Dependencies
1213 You can install the [CTDopts] and `lxml` modules via `conda`, like so:
1314
1415 ```sh
2122 Of course, you can just download [CTDopts] and make it available through your `PYTHONPATH` environment variable. To get more information about how to install python modules, visit: https://docs.python.org/2/install/.
2223
2324
24 ## How to install CTD2Galaxy
25 ## How to install CTDConverter
2526
26 1. Download the source code from https://github.com/genericworkflownodes/CTD2Galaxy.
27 1. Download the source code from https://github.com/genericworkflownodes/CTDConverter.
2728
28 ## How to use: most common tasks
29 ## Usage
2930
30 The generator takes several parameters and a varying number of inputs and outputs. The following sub-sections show how to perform the most common operations.
31 Check the detailed documentation for each of the converters:
3132
32 Running the generator with the `-h/--help` parameter will print extended information about each of the parameters.
33 - [Generation of Galaxy ToolConfig files](galaxy/README.md)
3334
34 ### Macros
35
36 Galaxy supports the use of macros via a `macros.xml` file (`CTD2Galaxy` provides a sample macros file in [macros.xml]). Instead of repeating sections, macros can be used and expanded. If you want fine control over the macros, you can use the `-m` / `--macros` parameter to provide your own macros file.
37
38 Please note that the used macros file **must** be copied to your Galaxy installation on the same location in which you place the generated *ToolConfig* files, otherwise Galaxy will not be able to parse the generated *ToolConfig* files!
39
40 ### One input, one output
41
42 In its simplest form, `CTD2Galaxy` takes an input CTD file and generates an output Galaxy *ToolConfig* file. The following use of `CTD2Galaxy`:
43
44 $ python generator.py -i /data/sample_input.ctd -o /data/sample_output.xml
45
46 will parse `/data/sample_input.ctd` and generate a Galaxy tool wrapper under `/data/sample_output.xml`. The generated file can be added to your Galaxy instance like any other tool.
47
48 ### Converting several CTDs at once
49
50 When converting several CTDs, the expected value for the `-o`/`--output-destination` parameter is a folder. For example:
51
52 $ python generator.py -i /data/ctds/one.ctd /data/ctds/two.ctd -o /data/generated-galaxy-stubs
53
54 Will convert `/data/ctds/one.ctd` into `/data/generated-galaxy-stubs/one.xml` and `/data/ctds/two.ctd` into `/data/generated-galaxy-stubs/two.xml`.
55
56 You can use wildcard expansion, as supported by most modern operating systems:
57
58 $ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs
59
60 ### Generating a tool_conf.xml file
61
62 The generator supports generation of a `tool_conf.xml` file which you can later use in your local Galaxy installation. The parameter `-t`/`--tool-conf-destination` contains the path of a file in which a `tool_conf.xml` file will be generated.
63
64 $ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -t /data/generated-galaxy-stubs/tool_conf.xml
65
66
67 ## How to use: parameters in detail
68
69 ### A word about parameters taking list of values
70
71 All parameters have a short and a long option and some parameters take list of values. Using either the long or the short option of the parameter will produce the same output. The following examples show how to pass values using the `-f` / `--foo` parameter:
72
73 The following uses of the parameter will pass the list of values containing `bar`, `blah` and `blu`:
74
75 -f bar blah blu
76 --foo bar blah blu
77 -f bar -f blah -f blu
78 --foo bar --foo blah --foo blu
79 -f bar --foo blah blu
80
81 The following uses of the parameter will pass a single value `bar`:
82
83 -f bar
84 --foo bar
85
86 ### Schema validation
87
88 * Purpose: Provide validation of input CTDs against a schema file (i.e., an XSD file).
89 * Short/long version: `-v` / `--validation-schema`
90 * Required: no.
91 * Taken values: location of the schema file (e.g., CTD.xsd).
92
93 CTDs can be validated against a schema. The master version of the schema can be found under [CTDSchema].
94
95 If a schema is provided, all input CTDs will be validated against it.
96
97 ### Input file(s)
98
99 * Purpose: Provide input CTD file(s) to convert.
100 * Short/long version: `-i` / `--input`
101 * Required: yes.
102 * Taken values: a list of input CTD files.
103
104 Example:
105
106 Any of the following invocations will convert `/data/input_one.ctd` and `/data/input_two.ctd`:
107
108 $ python generator.py -i /data/input_one.ctd -i /data/input_two.ctd -o /data/generated
109 $ python generator.py -i /data/input_one.ctd /data/input_two.ctd -o /data/generated
110 $ python generator.py --input /data/input_one.ctd /data/input_two.ctd -o /data/generated
111 $ python generator.py --input /data/input_one.ctd --input /data/input_two.ctd -o /data/generated
112
113 The following invocation will convert `/data/input.ctd` into `/data/output.xml`:
114
115 $ python generator.py -i /data/input.ctd -o /data/output.xml -m sample_files/macros.xml
116
117 Of course, you can also use wildcards, which most shells expand automatically. This is useful if you want to convert several files at a time. Imagine that the folder `/data/ctds` contains three files, `input_one.ctd`, `input_two.ctd` and `input_three.ctd`. The following two invocations will produce the same output in the `/data/galaxy` folder:
118
119 $ python generator.py -i /data/ctds/input_one.ctd /data/ctds/input_two.ctd /data/ctds/input_three.ctd -o /data/galaxy
120 $ python generator.py -i /data/ctds/*.ctd -o /data/galaxy
121
122 ### Finer control over the tools to be converted
123
124 Sometimes only a subset of the CTDs in a folder needs to be converted. The parameter `-r`/`--required-tools` takes the path of a file containing the names of the tools that will be converted.
125
126 $ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -r required_tools.txt
127
128 On the other hand, if you want the generator to skip conversion of some CTDs, the parameter `-s`/`--skip-tools` will take the path of a file containing the names of tools that will not be converted.
129
130 $ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -s skipped_tools.txt
131
132 The format of these files (`required_tools.txt`, `skipped_tools.txt` in the examples above) is straightforward. Each line contains the name of a tool and any line starting with `#` will be ignored.
133
134 ### Output destination
135
136 * Purpose: Provide output destination for the generated Galaxy *ToolConfig* files.
137 * Short/long version: `-o` / `--output-destination`
138 * Required: yes.
139 * Taken values: if a single input file is given, then a single output file is expected. If multiple input files are given, then an existing folder, in which all generated Galaxy *ToolConfig* files will be written, is expected.
140
141 Example:
142
143 A single input is given, and the output will be generated into `/data/output.xml`:
144
145 $ python generator.py -i /data/input.ctd -o /data/output.xml
146
147 Several inputs are given. The output is the already existing folder, `/data/stubs`, and at the end of the operation, the files `/data/stubs/input_one.ctd.xml` and `/data/stubs/input_two.ctd.xml` will be generated:
148
149 $ python generator.py -i /data/ctds/input_one.ctd /data/ctds/input_two.ctd -o /data/stubs
150
151
152 ### Adding parameters to the command line
153
154 * Purpose: Galaxy *ToolConfig* files include a `<command>` element in which the command line to invoke the tool can be given. Sometimes it is needed to invoke your tools in a certain way (i.e., passing certain parameters). For instance, some tools offer the possibility to be invoked in a verbose or quiet way or even to be invoked in a headless way (i.e., without GUI).
155 * Short/long version: `-a` / `--add-to-command-line`
156 * Required: no.
157 * Taken values: The command(s) to be added to the command line.
158
159 Example:
160
161 $ python generator.py ... -a "--quiet --no-gui"
162
163 Will generate the following `<command>` element in the generated Galaxy *ToolConfig*:
164
165 <command>TOOL_NAME --quiet --no-gui ...</command>
166
167
168 ### Blacklisting parameters
169
170 * Purpose: Some parameters present in the CTD are not to be exposed on Galaxy. Think of parameters such as `--help` or `--debug`, which might not make much sense to expose to end users in a workflow management system such as Galaxy.
171 * Short/long version: `-b` / `--blacklist-parameters`
172 * Required: no.
173 * Taken values: A list of parameters to be blacklisted.
174
175 Example:
176
177 $ python generator.py ... -b h help quiet
178
179 Parameters named `h`, `help`, or `quiet` will not be processed and will not appear in the generated Galaxy *ToolConfig*.
180
181 ### Generating a tool_conf.xml file
182
183 * Purpose: Galaxy uses a file `tool_conf.xml` in which other tools can be included. `CTD2Galaxy` can also generate this file. Categories will be extracted from the provided input CTDs and for each category, a different `<section>` will be generated. Any input CTD lacking a category will be sorted under the provided default category.
184 * Short/long version: `-t` / `--tool-conf-destination`
185 * Required: no.
186 * Taken values: The destination of the file.
187
188 ### Providing a default category
189
190 * Purpose: Input CTDs that lack a category will be sorted under the value given to this parameter. If this parameter is not given, then the category `DEFAULT` will be used.
191 * Short/long version: `-c` / `--default-category`
192 * Required: no.
193 * Taken values: The value for the default category to use for input CTDs lacking a category.
194
195 Example:
196
197 Suppose there is a folder containing several CTD files. Some of those CTDs don't have the optional attribute `category` and the rest belong to the `Data Processing` category. The following invocation:
198
199 $ python generator.py ... -c Other
200
201 will generate, for each of the categories, a different section. Additionally, CTDs lacking a category will be sorted under the given category, `Other`, as shown:
202
203 <section id="category-id-dataprocessing" name="Data Processing">
204 <tool file="some_path/tool_one.xml" />
205 <tool file="some_path/tool_two.xml" />
206 ...
207 </section>
208
209 <section id="category-id-other" name="Other">
210 <tool file="some_path/tool_three.xml" />
211 <tool file="some_path/tool_four.xml" />
212 ...
213 </section>
214
215 ### Providing a path for the location of the ToolConfig files
216
217 * Purpose: The `tool_conf.xml` file contains references to files which in turn contain Galaxy *ToolConfig* files. Using this parameter, you can provide information about the location of your tools.
218 * Short/long version: `-g` / `--galaxy-tool-path`
219 * Required: no.
220 * Taken values: The path relative to your `$GALAXY_ROOT/tools` folder on which your tools are located.
221
222 Example:
223
224 $ python generator.py ... -g my_tools_folder
225
226 Will generate `<tool>` elements in the generated `tool_conf.xml` as follows:
227
228 <tool file="my_tools_folder/some_tool.xml" />
229
230 In this example, `tool_conf.xml` refers to a file located on `$GALAXY_ROOT/tools/my_tools_folder/some_tool.xml`.
231
232
233 ### Hardcoding parameters
234
235 * Purpose: Fix the value of a parameter and hide it from the end user.
236 * Short/long version: `-p` / `--hardcoded-parameters`
237 * Required: no.
238 * Taken values: The path of a file containing the mapping between parameter names and hardcoded values to use in the `<command>` section.
239
240 It is sometimes required that parameters are hidden from the end user in workflow systems such as Galaxy and that they take a predetermined value. Allowing end users to control parameters such as `--verbosity`, `--threads`, etc., might create more problems than it solves. For this purpose, the parameter `-p`/`--hardcoded-parameters` takes the path of a file that contains up to three columns separated by whitespace that map parameter names to hardcoded values. The first column contains the name of the parameter and the second one the hardcoded value. The first two columns are mandatory.
241
242 If the parameter is to be hardcoded only for certain tools, a third column containing a comma separated list of tool names for which the hardcoding will apply can be added.
243
244 Lines starting with `#` will be ignored. The following is an example of a valid file:
245
246 # Parameter name # Value # Tool(s)
247 threads \${GALAXY_SLOTS:-24}
248 mode quiet
249 xtandem_executable xtandem XTandemAdapter
250 verbosity high Foo, Bar
251
252 This will produce a `<command>` section similar to the following one for all tools but `XTandemAdapter`, `Foo` and `Bar`:
253
254 <command>TOOL_NAME -threads \${GALAXY_SLOTS:-24} -mode quiet ...</command>
255
256 For `XTandemAdapter`, the `<command>` will be similar to:
257
258 <command>XTandemAdapter ... -threads \${GALAXY_SLOTS:-24} -mode quiet -xtandem_executable xtandem ...</command>
259
260 And for tools `Foo` and `Bar`, the `<command>` will be similar to:
261
262 <command>Foo ... ... -threads \${GALAXY_SLOTS:-24} -mode quiet -verbosity high ...</command>
263
264
265 ### Including additional macros files
266
267 * Purpose: Include external macros files.
268 * Short/long version: `-m` / `--macros`
269 * Required: no.
270 * Default: `macros.xml`
271 * Taken values: List of paths of macros files to include.
272
273 *ToolConfig* supports elaborate sections such as `<stdio>`, `<requirements>`, etc., that are identical across tools of the same suite. Macros files assist in the task of including external xml sections into *ToolConfig* files. For more information about the syntax of macros files, see: https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax#Reusing_Repeated_Configuration_Elements
274
275 There are some macros that are required, namely `stdio`, `requirements` and `advanced_options`. A template macro file is included in [macros.xml]. It can be edited to suit your needs and you could add extra macros or leave it as it is and include additional files.
276
277 Every macro found in the included files and in `support_files/macros.xml` will be expanded. Users are responsible for copying the given macros files in their corresponding galaxy folders.
278
279 ### Providing a default executable path
280
281 * Purpose: Help Galaxy locate tools by providing a path.
282 * Short/long version: `-x` / `--default-executable-path`
283 * Required: no.
284 * Taken values: The default executable path of the tools in the Galaxy installation.
285
286 CTDs can contain an `<executablePath>` element that will be used when executing the tool binary. If this element is missing, the value provided by this parameter will be used as a prefix when building the `<command>` section. Suppose that you have installed a tool suite in your local Galaxy instance under `/opt/suite/bin`. The following invocation of the converter:
287
288 $ python generator.py -x /opt/suite/bin ...
289
290 Will produce a `<command>` section similar to:
291
292 <command>/opt/suite/bin/Foo ...</command>
293
294 For those CTDs in which no `<executablePath>` could be found.
295
296
297 ### Generating a `datatypes_conf.xml` file
298
299 * Purpose: Specify the destination of a generated `datatypes_conf.xml` file.
300 * Short/long version: `-d` / `--datatypes-destination`
301 * Required: no.
302 * Taken values: The path in which `datatypes_conf.xml` will be generated.
303
304 It is likely that your tools use file formats or mimetypes that have not been registered in Galaxy. The generator allows you to specify a path in which an automatically generated `datatypes_conf.xml` file will be created. Consult the next section to get information about how to register file formats and mimetypes.
305
306
307 ### Providing Galaxy file formats
308
309 * Purpose: Register new file formats and mimetypes.
310 * Short/long version: `-f` / `--formats-file`
311 * Required: no.
312 * Taken values: The path of a file describing formats.
313
314 Galaxy supports the concept of file format in order to connect compatible ports, that is, input ports of a certain data format will be able to receive data from a port from the same format. This converter allows you to provide a personalized file in which you can relate the CTD data formats with supported Galaxy data formats. The format file is a simple text file, each line containing several columns separated by whitespace. The content of each column is as follows:
315
316 * 1st column: file extension, this column is required.
317 * 2nd column: data type, as listed in Galaxy, this column is optional.
318 * 3rd column: full-named Galaxy data type, as it will appear on datatypes_conf.xml; this column is required if the second column is included.
319 * 4th column: mimetype, this column is optional.
320
321 The following is an example of a valid "file formats" file:
322
323 # CTD type # Galaxy type # Long Galaxy data type # Mimetype
324 csv tabular galaxy.datatypes.data:Text
325 fasta
326 ini txt galaxy.datatypes.data:Text
327 txt
328 idxml txt galaxy.datatypes.xml:GenericXml application/xml
329 options txt galaxy.datatypes.data:Text
330 grid grid galaxy.datatypes.data:Grid
331
332 Note that each line consists of either one, three or four columns. In the case of data types already registered in Galaxy (such as `fasta` and `txt` in the above example), only the first column is needed. In the case of data types that haven't been yet registered in Galaxy, the first three columns are needed (mimetype is optional).
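
The column rules above can be sketched as a small parser. This is only an illustration and not the converter's actual code: the function name `parse_formats` and the returned tuple layout are hypothetical, and skipping blank lines and lines starting with `#` is an assumption based on the header line of the example file.

```python
def parse_formats(lines):
    """Parse a "file formats" file into (extension, galaxy_type, long_type, mimetype) tuples.

    Hypothetical helper: missing optional columns are returned as None.
    """
    formats = []
    for line in lines:
        stripped = line.strip()
        # Assumption: blank lines and '#' comment lines carry no data.
        if not stripped or stripped.startswith("#"):
            continue
        columns = stripped.split()
        # The text above allows exactly one, three or four columns.
        if len(columns) not in (1, 3, 4):
            raise ValueError("expected 1, 3 or 4 columns: %r" % line)
        extension = columns[0]
        galaxy_type = columns[1] if len(columns) >= 3 else None
        long_type = columns[2] if len(columns) >= 3 else None
        mimetype = columns[3] if len(columns) == 4 else None
        formats.append((extension, galaxy_type, long_type, mimetype))
    return formats


sample = [
    "# CTD type # Galaxy type # Long Galaxy data type # Mimetype",
    "fasta",
    "idxml txt galaxy.datatypes.xml:GenericXml application/xml",
]
for entry in parse_formats(sample):
    print(entry)
```

Rejecting two-column lines mirrors the rule that the third column is required whenever the second one is present.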
333
334 For information about Galaxy data types and subclasses, consult the following page: https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes
335
336
337 ## Notes about some of the *OpenMS* tools
338
339 * Most of the tools can be generated automatically. Some of the tools need some extra work (for now).
340 * These adapters need to be changed, such that you provide the path to the executable:
341 * FidoAdapter (add `-exe fido` in the command tag, delete the `$param_exe` in the command tag, delete the parameter from the input list).
342 * MSGFPlusAdapter (add `-executable msgfplus.jar` in the command tag, delete the `$param_executable` in the command tag, delete the parameter from the input list).
343 * MyriMatchAdapter (add `-myrimatch_executable myrimatch` in the command tag, delete the `$param_myrimatch_executable` in the command tag, delete the parameter from the input list).
344 * OMSSAAdapter (add `-omssa_executable omssa` in the command tag, delete the `$param_omssa_executable` in the command tag, delete the parameter from the input list).
345 * PepNovoAdapter (add `-pepnovo_executable pepnovo` in the command tag, delete the `$param_pepnovo_executable` in the command tag, delete the parameter from the input list).
346 * XTandemAdapter (add `-xtandem_executable xtandem` in the command tag, delete the `$param_xtandem_executable` in the command tag, delete the parameter from the input list).
347 * To avoid the deletion in the inputs you can also add these parameters to the blacklist:
348
349 $ python generator.py -b exe executable myrimatch_executable omssa_executable pepnovo_executable xtandem_executable
350
351 * These tools have multiple outputs (number of inputs = number of outputs) which is not yet supported in Galaxy-stable:
352 * SeedListGenerator
353 * SpecLibSearcher
354 * MapAlignerIdentification
355 * MapAlignerPoseClustering
356 * MapAlignerSpectrum
357 * MapAlignerRTTransformer
35835
35936 [CTDopts]: https://github.com/genericworkflownodes/CTDopts
36037 [macros.xml]: https://github.com/WorkflowConversion/CTD2Galaxy/blob/master/macros.xml
+0
-2
dist/conda/bld.bat
0 "%PYTHON%" setup.py install
1 if errorlevel 1 exit 1
+0
-1
dist/conda/build.sh
0 $PYTHON setup.py install
+0
-28
dist/conda/meta.yaml
0 package:
1 name: ctd2galaxy
2 version: "1.0"
3
4 source:
5 git_rev: v1.0
6 git_url: https://github.com/WorkflowConversion/CTD2Galaxy.git
7
8 build:
9 noarch_python: True
10
11 requirements:
12 build:
13 - python
14 - setuptools
15
16 run:
17 - python
18 - lxml
19 - ctdopts 1.0
20
21 test:
22 imports:
23 - CTDopts.CTDopts
24
25 about:
26 home: https://github.com/WorkflowConversion/CTD2Galaxy
27 license_file: LICENSE
0 # Conversion of CTD Files to Galaxy ToolConfigs
1
2 ## How to use: most common Tasks
3
4 The Galaxy ToolConfig generator takes several parameters and a varying number of inputs and outputs. The following sub-sections show how to perform the most common operations.
5
6 Running the generator with the `-h/--help` parameter will print extended information about each of the parameters.
7
8 ### Macros
9
10 Galaxy supports the use of macros via a `macros.xml` file (`CTD2Galaxy` provides a sample macros file in [macros.xml]). Instead of repeating sections, macros can be used and expanded. If you want fine control over the macros, you can use the `-m` / `--macros` parameter to provide your own macros file.
11
12 Please note that the used macros file **must** be copied to your Galaxy installation on the same location in which you place the generated *ToolConfig* files, otherwise Galaxy will not be able to parse the generated *ToolConfig* files!
13
14 ### One input, one Output
15
16 In its simplest form, `CTD2Galaxy` takes an input CTD file and generates an output Galaxy *ToolConfig* file. The following use of `CTD2Galaxy`:
17
18 $ python generator.py -i /data/sample_input.ctd -o /data/sample_output.xml
19
20 will parse `/data/sample_input.ctd` and generate a Galaxy tool wrapper under `/data/sample_output.xml`. The generated file can be added to your Galaxy instance like any other tool.
21
22 ### Converting several CTDs at once
23
24 When converting several CTDs, the expected value for the `-o`/`--output-destination` parameter is a folder. For example:
25
26 $ python generator.py -i /data/ctds/one.ctd /data/ctds/two.ctd -o /data/generated-galaxy-stubs
27
28 Will convert `/data/ctds/one.ctd` into `/data/generated-galaxy-stubs/one.xml` and `/data/ctds/two.ctd` into `/data/generated-galaxy-stubs/two.xml`.
29
30 You can use wildcard expansion, as supported by most modern operating systems:
31
32 $ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs
33
34 ### Generating a tool_conf.xml File
35
36 The generator supports generation of a `tool_conf.xml` file which you can later use in your local Galaxy installation. The parameter `-t`/`--tool-conf-destination` contains the path of a file in which a `tool_conf.xml` file will be generated.
37
38 $ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -t /data/generated-galaxy-stubs/tool_conf.xml
39
40
41 ## How to use: Parameters in Detail
42
43 ### A Word about Parameters taking Lists of Values
44
45 All parameters have a short and a long option and some parameters take list of values. Using either the long or the short option of the parameter will produce the same output. The following examples show how to pass values using the `-f` / `--foo` parameter:
46
47 The following uses of the parameter will pass the list of values containing `bar`, `blah` and `blu`:
48
49 -f bar blah blu
50 --foo bar blah blu
51 -f bar -f blah -f blu
52 --foo bar --foo blah --foo blu
53 -f bar --foo blah blu
54
55 The following uses of the parameter will pass a single value `bar`:
56
57 -f bar
58 --foo bar
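
One way to obtain this behaviour with Python's standard `argparse` module is to combine `action="append"` with `nargs="+"` and flatten the result. This is a minimal sketch, not necessarily how the generator implements it; `-f` / `--foo` is the same placeholder option used above and `parse_foo` is a hypothetical helper.

```python
import argparse
from itertools import chain

# Placeholder parser: -f/--foo may be repeated and each occurrence takes one or more values.
parser = argparse.ArgumentParser()
parser.add_argument("-f", "--foo", action="append", nargs="+", default=None)


def parse_foo(argv):
    """Return the flat list of values passed through -f/--foo."""
    args = parser.parse_args(argv)
    # action="append" with nargs="+" yields one inner list per occurrence,
    # so the inner lists are chained into a single flat list.
    return list(chain.from_iterable(args.foo or []))


print(parse_foo(["-f", "bar", "blah", "blu"]))
print(parse_foo(["-f", "bar", "--foo", "blah", "blu"]))
print(parse_foo(["--foo", "bar"]))
```

The first two calls produce the same three-element list, and the last one a list with the single value `bar`.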
59
60 ### Schema Validation
61
62 * Purpose: Provide validation of input CTDs against a schema file (i.e., an XSD file).
63 * Short/long version: `-v` / `--validation-schema`
64 * Required: no.
65 * Taken values: location of the schema file (e.g., CTD.xsd).
66
67 CTDs can be validated against a schema. The master version of the schema can be found under [CTDSchema].
68
69 If a schema is provided, all input CTDs will be validated against it.
70
71 ### Input File(s)
72
73 * Purpose: Provide input CTD file(s) to convert.
74 * Short/long version: `-i` / `--input`
75 * Required: yes.
76 * Taken values: a list of input CTD files.
77
78 Example:
79
80 Any of the following invocations will convert `/data/input_one.ctd` and `/data/input_two.ctd`:
81
82 $ python generator.py -i /data/input_one.ctd -i /data/input_two.ctd -o /data/generated
83 $ python generator.py -i /data/input_one.ctd /data/input_two.ctd -o /data/generated
84 $ python generator.py --input /data/input_one.ctd /data/input_two.ctd -o /data/generated
85 $ python generator.py --input /data/input_one.ctd --input /data/input_two.ctd -o /data/generated
86
87 The following invocation will convert `/data/input.ctd` into `/data/output.xml`:
88
89 $ python generator.py -i /data/input.ctd -o /data/output.xml -m sample_files/macros.xml
90
91 Of course, you can also use wildcards, which most shells expand automatically. This is useful if you want to convert several files at a time. Imagine that the folder `/data/ctds` contains three files, `input_one.ctd`, `input_two.ctd` and `input_three.ctd`. The following two invocations will produce the same output in the `/data/galaxy` folder:
92
93 $ python generator.py -i /data/ctds/input_one.ctd /data/ctds/input_two.ctd /data/ctds/input_three.ctd -o /data/galaxy
94 $ python generator.py -i /data/ctds/*.ctd -o /data/galaxy
95
96 ### Finer Control over the Tools to be converted
97
98 Sometimes only a subset of the CTDs in a folder needs to be converted. The parameter `-r`/`--required-tools` takes the path of a file containing the names of the tools that will be converted.
99
100 $ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -r required_tools.txt
101
102 On the other hand, if you want the generator to skip conversion of some CTDs, the parameter `-s`/`--skip-tools` will take the path of a file containing the names of tools that will not be converted.
103
104 $ python generator.py -i /data/ctds/*.ctd -o /data/generated-galaxy-stubs -s skipped_tools.txt
105
106 The format of these files (`required_tools.txt`, `skipped_tools.txt` in the examples above) is straightforward. Each line contains the name of a tool and any line starting with `#` will be ignored.
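
The file format described above can be read with a few lines of Python. The helper below is a sketch under stated assumptions rather than the generator's own code: the name `read_tool_names` is hypothetical, and skipping blank lines is an assumption the text does not spell out.

```python
def read_tool_names(lines):
    """Collect tool names from the lines of a required/skipped tools file."""
    names = []
    for line in lines:
        stripped = line.strip()
        # Lines starting with '#' are comments; blank lines are skipped by assumption.
        if not stripped or stripped.startswith("#"):
            continue
        names.append(stripped)
    return names


sample = [
    "# tools to convert",
    "FidoAdapter",
    "",
    "XTandemAdapter",
]
print(read_tool_names(sample))
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `read_tool_names(open("required_tools.txt"))`.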
107
108 ### Output Destination
109
110 * Purpose: Provide output destination for the generated Galaxy *ToolConfig* files.
111 * Short/long version: `-o` / `--output-destination`
112 * Required: yes.
113 * Taken values: if a single input file is given, then a single output file is expected. If multiple input files are given, then an existing folder, in which all generated Galaxy *ToolConfig* files will be written, is expected.
114
115 Example:
116
117 A single input is given, and the output will be generated into `/data/output.xml`:
118
119 $ python generator.py -i /data/input.ctd -o /data/output.xml
120
121 Several inputs are given. The output is the already existing folder, `/data/stubs`, and at the end of the operation, the files `/data/stubs/input_one.ctd.xml` and `/data/stubs/input_two.ctd.xml` will be generated:
122
123 $ python generator.py -i /data/ctds/input_one.ctd /data/ctds/input_two.ctd -o /data/stubs
124
125
126 ### Adding Parameters to the Command-line
127
128 * Purpose: Galaxy *ToolConfig* files include a `<command>` element in which the command line to invoke the tool can be given. Sometimes it is needed to invoke your tools in a certain way (i.e., passing certain parameters). For instance, some tools offer the possibility to be invoked in a verbose or quiet way or even to be invoked in a headless way (i.e., without GUI).
129 * Short/long version: `-a` / `--add-to-command-line`
130 * Required: no.
131 * Taken values: The command(s) to be added to the command line.
132
133 Example:
134
135 $ python generator.py ... -a "--quiet --no-gui"
136
137 Will generate the following `<command>` element in the generated Galaxy *ToolConfig*:
138
139 <command>TOOL_NAME --quiet --no-gui ...</command>
140
141
142 ### Blacklisting Parameters
143
144 * Purpose: Some parameters present in the CTD are not to be exposed on Galaxy. Think of parameters such as `--help` or `--debug`, which might not make much sense to expose to end users in a workflow management system such as Galaxy.
145 * Short/long version: `-b` / `--blacklist-parameters`
146 * Required: no.
147 * Taken values: A list of parameters to be blacklisted.
148
149 Example:
150
151 $ python generator.py ... -b h help quiet
152
153 Parameters named `h`, `help`, or `quiet` will not be processed and will not appear in the generated Galaxy *ToolConfig*.
154
155 ### Generating a tool_conf.xml file
156
157 * Purpose: Galaxy uses a file `tool_conf.xml` in which other tools can be included. `CTD2Galaxy` can also generate this file. Categories will be extracted from the provided input CTDs and for each category, a different `<section>` will be generated. Any input CTD lacking a category will be sorted under the provided default category.
158 * Short/long version: `-t` / `--tool-conf-destination`
159 * Required: no.
160 * Taken values: The destination of the file.
161
162 ### Providing a default Category
163
164 * Purpose: Input CTDs that lack a category will be sorted under the value given to this parameter. If this parameter is not given, then the category `DEFAULT` will be used.
165 * Short/long version: `-c` / `--default-category`
166 * Required: no.
167 * Taken values: The value for the default category to use for input CTDs lacking a category.
168
169 Example:
170
171 Suppose there is a folder containing several CTD files. Some of those CTDs don't have the optional attribute `category` and the rest belong to the `Data Processing` category. The following invocation:
172
173 $ python generator.py ... -c Other
174
175 will generate, for each of the categories, a different section. Additionally, CTDs lacking a category will be sorted under the given category, `Other`, as shown:
176
177 <section id="category-id-dataprocessing" name="Data Processing">
178 <tool file="some_path/tool_one.xml" />
179 <tool file="some_path/tool_two.xml" />
180 ...
181 </section>
182
183 <section id="category-id-other" name="Other">
184 <tool file="some_path/tool_three.xml" />
185 <tool file="some_path/tool_four.xml" />
186 ...
187 </section>
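
The grouping shown above can be sketched with `xml.etree.ElementTree` from the standard library. Everything here is illustrative: `build_tool_conf` is a hypothetical function, the `<toolbox>` root element is assumed to wrap the sections, and the rule used to derive each section `id` (lowercase the category name and drop non-alphanumeric characters) is only inferred from the example output.

```python
import re
import xml.etree.ElementTree as ET


def build_tool_conf(tools, default_category="DEFAULT"):
    """Group (tool_file, category) pairs into <section> elements.

    A category of None means the CTD lacks a category, so the default is used.
    """
    toolbox = ET.Element("toolbox")
    sections = {}
    for tool_file, category in tools:
        name = category or default_category
        if name not in sections:
            # Assumed id scheme, inferred from "Data Processing" -> "dataprocessing".
            section_id = "category-id-" + re.sub(r"[^a-z0-9]", "", name.lower())
            sections[name] = ET.SubElement(toolbox, "section", id=section_id, name=name)
        ET.SubElement(sections[name], "tool", file=tool_file)
    return toolbox


root = build_tool_conf(
    [
        ("some_path/tool_one.xml", "Data Processing"),
        ("some_path/tool_two.xml", "Data Processing"),
        ("some_path/tool_three.xml", None),
    ],
    default_category="Other",
)
print(ET.tostring(root, encoding="unicode"))
```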
188
189 ### Providing a Path for the Location of the ToolConfig Files
190
191 * Purpose: The `tool_conf.xml` file contains references to files which in turn contain Galaxy *ToolConfig* files. Using this parameter, you can provide information about the location of your tools.
192 * Short/long version: `-g` / `--galaxy-tool-path`
193 * Required: no.
194 * Taken values: The path relative to your `$GALAXY_ROOT/tools` folder on which your tools are located.
195
196 Example:
197
198 $ python generator.py ... -g my_tools_folder
199
200 Will generate `<tool>` elements in the generated `tool_conf.xml` as follows:
201
202 <tool file="my_tools_folder/some_tool.xml" />
203
204 In this example, `tool_conf.xml` refers to a file located on `$GALAXY_ROOT/tools/my_tools_folder/some_tool.xml`.
205
206
207 ### Hardcoding Parameters
208
209 * Purpose: Fix the value of a parameter and hide it from the end user.
210 * Short/long version: `-p` / `--hardcoded-parameters`
211 * Required: no.
212 * Taken values: The path of a file containing the mapping between parameter names and hardcoded values to use in the `<command>` section.
213
214 It is sometimes required that parameters are hidden from the end user in workflow systems such as Galaxy and that they take a predetermined value. Allowing end users to control parameters such as `--verbosity`, `--threads`, etc., might create more problems than it solves. For this purpose, the parameter `-p` / `--hardcoded-parameters` takes the path of a file that contains up to three columns separated by whitespace that map parameter names to hardcoded values. The first column contains the name of the parameter and the second one the hardcoded value. The first two columns are mandatory.
215
216 If the parameter is to be hardcoded only for certain tools, a third column containing a comma-separated list of the tool names to which the hardcoding applies can be added.
217
218 Lines starting with `#` will be ignored. The following is an example of a valid file:
219
220 # Parameter name # Value # Tool(s)
221 threads \${GALAXY_SLOTS:-24}
222 mode quiet
223 xtandem_executable xtandem XTandemAdapter
224 verbosity high Foo, Bar
225
226 This will produce a `<command>` section similar to the following one for all tools but `XTandemAdapter`, `Foo` and `Bar`:
227
228 <command>TOOL_NAME -threads \${GALAXY_SLOTS:-24} -mode quiet ...</command>
229
230 For `XTandemAdapter`, the `<command>` will be similar to:
231
232 <command>XTandemAdapter ... -threads \${GALAXY_SLOTS:-24} -mode quiet -xtandem_executable xtandem ...</command>
233
234 And for tools `Foo` and `Bar`, the `<command>` will be similar to:
235
236 <command>Foo ... ... -threads \${GALAXY_SLOTS:-24} -mode quiet -verbosity high ...</command>
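The file layout and the lookup rule (a tool-specific value takes precedence over a value that applies to all tools) can be sketched as follows. This is a simplified Python 3 illustration with hypothetical names (`parse_hardcoded_parameters` takes an iterable of lines here, and the mapping is keyed by tuples); it is not a copy of the converter's implementation.

```python
def parse_hardcoded_parameters(lines):
    """Parse lines of 'name value [tool1, tool2, ...]' into a dictionary.

    Keys are (parameter_name, tool_name) tuples; tool_name is None when the
    value applies to every tool.
    """
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        # Split at most twice so the tool list is kept as one column.
        columns = line.split(None, 2)
        if len(columns) < 2:
            continue  # the first two columns are mandatory
        name, value = columns[0], columns[1]
        if len(columns) == 3:
            for tool in columns[2].split(","):
                mapping[(name, tool.strip())] = value
        else:
            mapping[(name, None)] = value
    return mapping


def hardcoded_value(mapping, parameter_name, tool_name):
    """Return the most specific hardcoded value, or None if there is none."""
    specific = mapping.get((parameter_name, tool_name))
    if specific is not None:
        return specific
    return mapping.get((parameter_name, None))


if __name__ == "__main__":
    example = [
        "# Parameter name   # Value   # Tool(s)",
        "mode               quiet",
        "xtandem_executable xtandem   XTandemAdapter",
        "verbosity          high      Foo, Bar",
    ]
    parameters = parse_hardcoded_parameters(example)
    print(hardcoded_value(parameters, "verbosity", "Bar"))
    print(hardcoded_value(parameters, "verbosity", "XTandemAdapter"))
```

Splitting with a maximum of two splits is what allows the tool list to contain spaces after the commas, as in `Foo, Bar`.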
237
238
239 ### Including additional Macros Files
240
241 * Purpose: Include external macros files.
242 * Short/long version: `-m` / `--macros`
243 * Required: no.
244 * Default: `macros.xml`
245 * Taken values: List of paths of macros files to include.
246
247 *ToolConfig* supports elaborate sections such as `<stdio>`, `<requirements>`, etc., that are identical across tools of the same suite. Macros files assist in the task of including external xml sections into *ToolConfig* files. For more information about the syntax of macros files, see: https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax#Reusing_Repeated_Configuration_Elements
248
249 Three macros are required, namely `stdio`, `requirements` and `advanced_options`. A template macros file is included in [macros.xml]. You can edit it to suit your needs by adding extra macros, or leave it as it is and include additional files.
250
251 Every macro found in the included files and in `support_files/macros.xml` will be expanded. Users are responsible for copying the given macros files to their corresponding Galaxy folders.
252
253 ### Providing a default executable Path
254
255 * Purpose: Help Galaxy locate tools by providing a path.
256 * Short/long version: `-x` / `--default-executable-path`
257 * Required: no.
258 * Taken values: The default executable path of the tools in the Galaxy installation.
259
260 CTDs can contain an `<executablePath>` element that will be used when executing the tool binary. If this element is missing, the value provided by this parameter will be used as a prefix when building the `<command>` section. Suppose that you have installed a tool suite in your local Galaxy instance under `/opt/suite/bin`. The following invocation of the converter:
261
262 $ python generator.py -x /opt/suite/bin ...
263
264 Will produce a `<command>` section similar to:
265
266 <command>/opt/suite/bin/Foo ...</command>
267
268 For those CTDs in which no `<executablePath>` could be found.
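The fallback case can be illustrated with a short sketch. The function name `apply_default_executable_path` is hypothetical, and the sketch only covers CTDs without an `<executablePath>` element; how a path found in the CTD itself is used is not shown here.

```python
import posixpath


def apply_default_executable_path(tool_name, default_executable_path=None):
    """Build the executable part of a <command> for a CTD lacking <executablePath>."""
    if not default_executable_path:
        # Without a default path, Galaxy has to find the tool on its own.
        return tool_name
    return posixpath.join(default_executable_path, tool_name)


if __name__ == "__main__":
    print(apply_default_executable_path("Foo", "/opt/suite/bin"))
```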
269
270
271 ### Generating a `datatypes_conf.xml` File
272
273 * Purpose: Specify the destination of a generated `datatypes_conf.xml` file.
274 * Short/long version: `-d` / `--datatypes-destination`
275 * Required: no.
276 * Taken values: The path in which `datatypes_conf.xml` will be generated.
277
278 It is likely that your tools use file formats or mimetypes that have not been registered in Galaxy. The generator allows you to specify a path in which an automatically generated `datatypes_conf.xml` file will be created. Consult the next section to get information about how to register file formats and mimetypes.
279
280
281 ### Providing Galaxy File Formats
282
283 * Purpose: Register new file formats and mimetypes.
284 * Short/long version: `-f` / `--formats-file`
285 * Required: no.
286 * Taken values: The path of a file describing formats.
287
288 Galaxy supports the concept of file format in order to connect compatible ports, that is, input ports of a certain data format will be able to receive data from a port from the same format. This converter allows you to provide a personalized file in which you can relate the CTD data formats with supported Galaxy data formats. The format file is a simple text file, each line containing several columns separated by whitespace. The content of each column is as follows:
289
290 * 1st column: file extension, this column is required.
291 * 2nd column: data type, as listed in Galaxy, this column is optional.
292 * 3rd column: full-named Galaxy data type, as it will appear on datatypes_conf.xml; this column is required if the second column is included.
293 * 4th column: mimetype, this column is optional.
294
295 The following is an example of a valid "file formats" file:
296
297 # CTD type # Galaxy type # Long Galaxy data type # Mimetype
298 csv tabular galaxy.datatypes.data:Text
299 fasta
300 ini txt galaxy.datatypes.data:Text
301 txt
302 idxml txt galaxy.datatypes.xml:GenericXml application/xml
303 options txt galaxy.datatypes.data:Text
304 grid grid galaxy.datatypes.data:Grid
305
306 Note that each line consists of either one, three or four columns. In the case of data types already registered in Galaxy (such as `fasta` and `txt` in the above example), only the first column is needed. In the case of data types that have not yet been registered in Galaxy, the first three columns are needed (mimetype is optional).
307
308 For information about Galaxy data types and subclasses, consult the following page: https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes
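The column rules above can be sketched as a small Python 3 parser. The names used here (`DataType` as a named tuple, `parse_file_formats` taking an iterable of lines) are chosen for illustration and are not claimed to match the converter's own code.

```python
from collections import namedtuple

DataType = namedtuple("DataType", "extension galaxy_extension galaxy_type mimetype")


def parse_file_formats(lines):
    """Map each CTD file extension to a DataType.

    Valid lines have one column (type already registered in Galaxy), three
    columns (new type without mimetype) or four columns (new type with mimetype).
    """
    formats = {}
    for line in lines:
        columns = line.split()
        if not columns or columns[0].startswith("#"):
            continue  # blank lines and comments are ignored
        if len(columns) == 1:
            # Already known to Galaxy: the Galaxy extension equals the CTD one.
            formats[columns[0]] = DataType(columns[0], columns[0], None, None)
        elif len(columns) in (3, 4):
            mimetype = columns[3] if len(columns) == 4 else None
            formats[columns[0]] = DataType(columns[0], columns[1], columns[2], mimetype)
        # Lines with any other number of columns are skipped as invalid.
    return formats


if __name__ == "__main__":
    example = [
        "# CTD type  # Galaxy type  # Long Galaxy data type          # Mimetype",
        "csv         tabular        galaxy.datatypes.data:Text",
        "fasta",
        "idxml       txt            galaxy.datatypes.xml:GenericXml  application/xml",
    ]
    for extension, data_type in parse_file_formats(example).items():
        print(extension, data_type)
```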
309
310
311 ## Notes about some of the *OpenMS* Tools
312
313 * Most of the tools can be generated automatically. Some of the tools need some extra work (for now).
314 * These adapters need to be changed, such that you provide the path to the executable:
315 * FidoAdapter (add `-exe fido` in the command tag, delete the `$param_exe` in the command tag, delete the parameter from the input list).
316 * MSGFPlusAdapter (add `-executable msgfplus.jar` in the command tag, delete the `$param_executable` in the command tag, delete the parameter from the input list).
317 * MyriMatchAdapter (add `-myrimatch_executable myrimatch` in the command tag, delete the `$param_myrimatch_executable` in the command tag, delete the parameter from the input list).
318 * OMSSAAdapter (add `-omssa_executable omssa` in the command tag, delete the `$param_omssa_executable` in the command tag, delete the parameter from the input list).
319 * PepNovoAdapter (add `-pepnovo_executable pepnovo` in the command tag, delete the `$param_pepnovo_executable` in the command tag, delete the parameter from the input list).
320 * XTandemAdapter (add `-xtandem_executable xtandem` in the command tag, delete the `$param_xtandem_executable` in the command tag, delete the parameter from the input list).
321 * To avoid the deletion in the inputs, you can also add these parameters to the blacklist:
322
323 $ python generator.py -b exe executable myrimatch_executable omssa_executable pepnovo_executable xtandem_executable
324
325 * These tools have multiple outputs (number of inputs = number of outputs) which is not yet supported in Galaxy-stable:
326 * SeedListGenerator
327 * SpecLibSearcher
328 * MapAlignerIdentification
329 * MapAlignerPoseClustering
330 * MapAlignerSpectrum
331 * MapAlignerRTTransformer
332
333 [CTDopts]: https://github.com/genericworkflownodes/CTDopts
334 [macros.xml]: https://github.com/WorkflowConversion/CTD2Galaxy/blob/master/macros.xml
335 [CTDSchema]: https://github.com/genericworkflownodes/CTDSchema
0 "%PYTHON%" setup.py install
1 if errorlevel 1 exit 1
0 $PYTHON setup.py install
0 package:
1 name: ctd2galaxy
2 version: "1.0"
3
4 source:
5 git_rev: v1.0
6 git_url: https://github.com/WorkflowConversion/CTD2Galaxy.git
7
8 build:
9 noarch_python: True
10
11 requirements:
12 build:
13 - python
14 - setuptools
15
16 run:
17 - python
18 - lxml
19 - ctdopts 1.0
20
21 test:
22 imports:
23 - CTDopts.CTDopts
24
25 about:
26 home: https://github.com/WorkflowConversion/CTD2Galaxy
27 license_file: LICENSE
0 #!/usr/bin/env python
1 # encoding: utf-8
2
3 """
4 @author: delagarza
5 """
6
7
8 import sys
9 import os
10 import traceback
11 import ntpath
12 import string
13
14 from argparse import ArgumentParser
15 from argparse import RawDescriptionHelpFormatter
16 from collections import OrderedDict
17 from string import strip
18 from lxml import etree
19 from lxml.etree import SubElement, Element, ElementTree, ParseError, parse
20
21 from CTDopts.CTDopts import CTDModel, _InFile, _OutFile, ParameterGroup, _Choices, _NumericRange, _FileFormat, \
22 ModelError, _Null
23
24 __all__ = []
25 __version__ = 1.0
26 __date__ = '2014-09-17'
27 __updated__ = '2016-05-09'
28
29 MESSAGE_INDENTATION_INCREMENT = 2
30
31 TYPE_TO_GALAXY_TYPE = {int: 'integer', float: 'float', str: 'text', bool: 'boolean', _InFile: 'data',
32 _OutFile: 'data', _Choices: 'select'}
33
34 STDIO_MACRO_NAME = "stdio"
35 REQUIREMENTS_MACRO_NAME = "requirements"
36 ADVANCED_OPTIONS_MACRO_NAME = "advanced_options"
37
38 REQUIRED_MACROS = [STDIO_MACRO_NAME, REQUIREMENTS_MACRO_NAME, ADVANCED_OPTIONS_MACRO_NAME]
39
40
41 class CLIError(Exception):
42 # Generic exception to raise and log different fatal errors.
43 def __init__(self, msg):
44 super(CLIError, self).__init__(msg)
45 self.msg = "E: %s" % msg
46
47 def __str__(self):
48 return self.msg
49
50 def __unicode__(self):
51 return self.msg
52
53
54 class InvalidModelException(ModelError):
55 def __init__(self, message):
56 super(InvalidModelException, self).__init__()
57 self.message = message
58
59 def __str__(self):
60 return self.message
61
62 def __repr__(self):
63 return self.message
64
65
66 class ApplicationException(Exception):
67 def __init__(self, msg):
68 super(ApplicationException, self).__init__(msg)
69 self.msg = msg
70
71 def __str__(self):
72 return self.msg
73
74 def __unicode__(self):
75 return self.msg
76
77
78 class ExitCode:
79 def __init__(self, code_range="", level="", description=None):
80 self.range = code_range
81 self.level = level
82 self.description = description
83
84
85 class DataType:
86 def __init__(self, extension, galaxy_extension=None, galaxy_type=None, mimetype=None):
87 self.extension = extension
88 self.galaxy_extension = galaxy_extension
89 self.galaxy_type = galaxy_type
90 self.mimetype = mimetype
91
92
93 class ParameterHardcoder:
94 def __init__(self):
95 # map whose keys are the composite names of parameters and tools in the following pattern:
96 # [ParameterName][separator][ToolName] -> HardcodedValue
97 # if the parameter applies to all tools, then the following pattern is used:
98 # [ParameterName] -> HardcodedValue
99
100 # examples (assuming separator is '!'):
101 # threads -> 24
102 # adapter!XtandemAdapter -> xtandem.exe
103 # adapter -> adapter.exe
104 self.separator = "!"
105 self.parameter_map = {}
106
107 # the most specific value will be returned in case of overlap
108 def get_hardcoded_value(self, parameter_name, tool_name):
109 # look for the value that would apply for all tools
110 generic_value = self.parameter_map.get(parameter_name, None)
111 specific_value = self.parameter_map.get(self.build_key(parameter_name, tool_name), None)
112 if specific_value is not None:
113 return specific_value
114
115 return generic_value
116
117 def register_parameter(self, parameter_name, parameter_value, tool_name=None):
118 self.parameter_map[self.build_key(parameter_name, tool_name)] = parameter_value
119
120 def build_key(self, parameter_name, tool_name):
121 if tool_name is None:
122 return parameter_name
123 return "%s%s%s" % (parameter_name, self.separator, tool_name)
124
125
126 def main(argv=None): # IGNORE:C0111
127 # Command line options.
128 if argv is None:
129 argv = sys.argv
130 else:
131 sys.argv.extend(argv)
132
133 program_version = "v%s" % __version__
134 program_build_date = str(__updated__)
135 program_version_message = '%%(prog)s %s (%s)' % (program_version, program_build_date)
136 program_short_description = "CTD2Galaxy - A project from the GenericWorkflowNodes family " \
137 "(https://github.com/orgs/genericworkflownodes)"
138 program_usage = '''
139 USAGE:
140
141 I - Parsing a single CTD file and generate a Galaxy wrapper:
142
143 $ python generator.py -i input.ctd -o output.xml
144
145
146 II - Parsing all found CTD files (files with .ctd and .xml extension) in a given folder and
147 output converted Galaxy wrappers in a given folder:
148
149 $ python generator.py -i /home/user/*.ctd -o /home/user/galaxywrappers
150
151
152 III - Providing file formats, mimetypes
153
154 Galaxy supports the concept of file format in order to connect compatible ports, that is, input ports of a certain
155 data format will be able to receive data from a port from the same format. This converter allows you to provide
156 a personalized file in which you can relate the CTD data formats with supported Galaxy data formats. The layout of
157 this file consists of lines, each of either one, three or four columns separated by any amount of whitespace. The content
158 of each column is as follows:
159
160 * 1st column: file extension
161 * 2nd column: data type, as listed in Galaxy
162 * 3rd column: full-named Galaxy data type, as it will appear on datatypes_conf.xml
163 * 4th column: mimetype (optional)
164
165 The following is an example of a valid "file formats" file:
166
167 ########################################## FILE FORMATS example ##########################################
168 # Every line starting with a # will be handled as a comment and will not be parsed.
169 # The first column is the file format as given in the CTD and second column is the Galaxy data format.
170 # The second, third and fourth columns can be left empty if the data type has already been registered
171 # in Galaxy, otherwise, all but the mimetype must be provided.
172
173 # CTD type # Galaxy type # Long Galaxy data type # Mimetype
174 csv tabular galaxy.datatypes.data:Text
175 fasta
176 ini txt galaxy.datatypes.data:Text
177 txt
178 idxml txt galaxy.datatypes.xml:GenericXml application/xml
179 options txt galaxy.datatypes.data:Text
180 grid grid galaxy.datatypes.data:Grid
181
182 ##########################################################################################################
183
184 Note that each line consists precisely of either one, three or four columns. In the case of data types already
185 registered in Galaxy (such as fasta and txt in the above example), only the first column is needed. In the case of
186 data types that haven't been yet registered in Galaxy, the first three columns are needed (mimetype is optional).
187
188 For information about Galaxy data types and subclasses, see the following page:
189 https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes
190
191
192 IV - Hardcoding parameters
193
194 It is possible to hardcode parameters. This makes sense if you want to set a tool in Galaxy in 'quiet' mode or if
195 your tools support multi-threading and accept the number of threads via a parameter, without giving the end user the
196 chance to change the values for these parameters.
197
198 In order to generate hardcoded parameters, you need to provide a simple file. Each line of this file contains two
199 or three columns separated by whitespace. Any line starting with a '#' will be ignored. The first column contains
200 the name of the parameter, the second column contains the value that will always be set for this parameter. The
201 first two columns are mandatory.
202
203 If the parameter is to be hardcoded only for a set of tools, then a third column can be added. This column includes
204 a comma-separated list of tool names for which the parameter will be hardcoded. If a third column is not included,
205 then all processed tools containing the given parameter will get a hardcoded value for it.
206
207 The following is an example of a valid file:
208
209 ##################################### HARDCODED PARAMETERS example #####################################
210 # Every line starting with a # will be handled as a comment and will not be parsed.
211 # The first column is the name of the parameter and the second column is the value that will be used.
212
213 # Parameter name # Value # Tool(s)
214 threads \${GALAXY_SLOTS:-24}
215 mode quiet
216 xtandem_executable xtandem XTandemAdapter
217 verbosity high Foo, Bar
218
219 #########################################################################################################
220
221 Using the above file will produce a <command> similar to:
222
223 [tool_name] ... -threads \${GALAXY_SLOTS:-24} -mode quiet ...
224
225 For all tools. For XTandemAdapter, the <command> will be similar to:
226
227 XTandemAdapter ... -threads \${GALAXY_SLOTS:-24} -mode quiet -xtandem_executable xtandem ...
228
229 And for tools Foo and Bar, the <command> will be similar to:
230
231 Foo ... ... -threads \${GALAXY_SLOTS:-24} -mode quiet -verbosity high ...
232
233
234 V - Control which tools will be converted
235
236 Sometimes only a subset of CTDs needs to be converted. It is possible to either explicitly specify which tools will
237 be converted or which tools will not be converted.
238
239 The value of the -s/--skip-tools parameter is a file in which each line will be interpreted as the name of a tool
240 that will not be converted. Conversely, the value of the -r/--required-tools is a file in which each line will be
241 interpreted as a tool that is required. Only one of these parameters can be specified at a given time.
242
243 The format of both files is exactly the same. As stated before, each line will be interpreted as the name of a tool;
244 any line starting with a '#' will be ignored.
245
246 '''
247 program_license = '''%(short_description)s
248 Copyright 2015, Luis de la Garza
249
250 Licensed under the Apache License, Version 2.0 (the "License");
251 you may not use this file except in compliance with the License.
252 You may obtain a copy of the License at
253
254 http://www.apache.org/licenses/LICENSE-2.0
255
256 Unless required by applicable law or agreed to in writing, software
257 distributed under the License is distributed on an "AS IS" BASIS,
258 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
259 See the License for the specific language governing permissions and
260 limitations under the License.
261
262 %(usage)s
263 ''' % {'short_description': program_short_description, 'usage': program_usage}
264
265 try:
266 # Setup argument parser
267 parser = ArgumentParser(prog="CTD2Galaxy", description=program_license,
268 formatter_class=RawDescriptionHelpFormatter, add_help=True)
269 parser.add_argument("-i", "--input", dest="input_files", default=[], required=True, nargs="+", action="append",
270 help="List of CTD files to convert.")
271 parser.add_argument("-o", "--output-destination", dest="output_destination", required=True,
272 help="If multiple input files are given, then a folder in which all "
273 "XMLs will be generated is expected; "
274 "if a single input file is given, then a destination file is expected.")
275 parser.add_argument("-f", "--formats-file", dest="formats_file",
276 help="File containing the supported file formats. Run with '-h' or '--help' to see a "
277 "brief example on the layout of this file.", default=None, required=False)
278 parser.add_argument("-a", "--add-to-command-line", dest="add_to_command_line",
279 help="Adds content to the command line", default="", required=False)
280 parser.add_argument("-d", "--datatypes-destination", dest="data_types_destination",
281 help="Specify the location of a datatypes_conf.xml to modify and add the registered "
282 "data types. If the provided destination does not exist, a new file will be created.",
283 default=None, required=False)
284 parser.add_argument("-x", "--default-executable-path", dest="default_executable_path",
285 help="Use this executable path when <executablePath> is not present in the CTD",
286 default=None, required=False)
287 parser.add_argument("-b", "--blacklist-parameters", dest="blacklisted_parameters", default=[], nargs="+", action="append",
288 help="List of parameters that will be ignored and won't appear on the galaxy stub",
289 required=False)
290 parser.add_argument("-c", "--default-category", dest="default_category", default="DEFAULT", required=False,
291 help="Default category to use for tools lacking a category when generating tool_conf.xml")
292 parser.add_argument("-t", "--tool-conf-destination", dest="tool_conf_destination", default=None, required=False,
293 help="Specify the location of an existing tool_conf.xml that will be modified to include "
294 "the converted tools. If the provided destination does not exist, a new file will "
295 "be created.")
296 parser.add_argument("-g", "--galaxy-tool-path", dest="galaxy_tool_path", default=None, required=False,
297 help="The path that will be prepended to the file names when generating tool_conf.xml")
298 parser.add_argument("-r", "--required-tools", dest="required_tools_file", default=None, required=False,
299 help="Each line of the file will be interpreted as a tool name that needs translation. "
300 "Run with '-h' or '--help' to see a brief example on the format of this file.")
301 parser.add_argument("-s", "--skip-tools", dest="skip_tools_file", default=None, required=False,
302 help="File containing a list of tools for which a Galaxy stub will not be generated. "
303 "Run with '-h' or '--help' to see a brief example on the format of this file.")
304 parser.add_argument("-m", "--macros", dest="macros_files", default=[], nargs="*",
305 action="append", required=None, help="Import the additional given file(s) as macros. "
306 "The macros stdio, requirements and advanced_options are required. Please see "
307 "macros.xml for an example of a valid macros file. All defined macros will be imported.")
308 parser.add_argument("-p", "--hardcoded-parameters", dest="hardcoded_parameters", default=None, required=False,
309 help="File containing hardcoded values for the given parameters. Run with '-h' or '--help' "
310 "to see a brief example on the format of this file.")
311 parser.add_argument("-v", "--validation-schema", dest="xsd_location", default=None, required=False,
312 help="Location of the schema to use to validate CTDs.")
313
314 # TODO: add verbosity, maybe?
315 parser.add_argument("-V", "--version", action='version', version=program_version_message)
316
317 # Process arguments
318 args = parser.parse_args()
319
320 # validate and prepare the passed arguments
321 validate_and_prepare_args(args)
322
323 # extract the names of the macros and check that we have found the ones we need
324 macros_to_expand = parse_macros_files(args.macros_files)
325
326 # parse the given supported file-formats file
327 supported_file_formats = parse_file_formats(args.formats_file)
328
329 # parse the hardcoded parameters file
330 parameter_hardcoder = parse_hardcoded_parameters(args.hardcoded_parameters)
331
332 # parse the skip/required tools files
333 skip_tools = parse_tools_list_file(args.skip_tools_file)
334 required_tools = parse_tools_list_file(args.required_tools_file)
335
336 #if verbose > 0:
337 # print("Verbose mode on")
338 parsed_models = convert(args.input_files,
339 args.output_destination,
340 supported_file_formats=supported_file_formats,
341 default_executable_path=args.default_executable_path,
342 add_to_command_line=args.add_to_command_line,
343 blacklisted_parameters=args.blacklisted_parameters,
344 required_tools=required_tools,
345 skip_tools=skip_tools,
346 macros_file_names=args.macros_files,
347 macros_to_expand=macros_to_expand,
348 parameter_hardcoder=parameter_hardcoder,
349 xsd_location=args.xsd_location)
350
351 #TODO: add some sort of warning if a macro that doesn't exist is to be expanded
352
353 # it is not needed to copy the macros files, since the user has provided them
354
355 # generation of galaxy stubs is ready... now, let's see if we need to generate a tool_conf.xml
356 if args.tool_conf_destination is not None:
357 generate_tool_conf(parsed_models, args.tool_conf_destination,
358 args.galaxy_tool_path, args.default_category)
359
360 # now datatypes_conf.xml
361 if args.data_types_destination is not None:
362 generate_data_type_conf(supported_file_formats, args.data_types_destination)
363
364 return 0
365
366 except KeyboardInterrupt:
367 # handle keyboard interrupt
368 return 0
369 except ApplicationException, e:
370 error("CTD2Galaxy could not complete the requested operation.", 0)
371 error("Reason: " + e.msg, 0)
372 return 1
373 except ModelError, e:
374 error("There seems to be a problem with one of your input CTDs.", 0)
375 error("Reason: " + str(e), 0)
376 return 1
377 except Exception, e:
378 traceback.print_exc()
379 return 2
380
381
382 def parse_tools_list_file(tools_list_file):
383 tools_list = None
384 if tools_list_file is not None:
385 tools_list = []
386 with open(tools_list_file) as f:
387 for line in f:
388 if line is None or not line.strip() or line.strip().startswith("#"):
389 continue
390 else:
391 tools_list.append(line.strip())
392
393 return tools_list
394
395
396 def parse_macros_files(macros_file_names):
397 macros_to_expand = set()
398
399 for macros_file_name in macros_file_names:
400 try:
401 macros_file = open(macros_file_name)
402 info("Loading macros from %s" % macros_file_name, 0)
403 root = parse(macros_file).getroot()
404 for xml_element in root.findall("xml"):
405 name = xml_element.attrib["name"]
406 if name in macros_to_expand:
407 warning("Macro %s has already been found. Duplicate found in file %s." %
408 (name, macros_file_name), 0)
409 else:
410 info("Macro %s found" % name, 1)
411 macros_to_expand.add(name)
412 except ParseError, e:
413 raise ApplicationException("The macros file " + macros_file_name + " could not be parsed. Cause: " +
414 str(e))
415 except IOError, e:
416 raise ApplicationException("The macros file " + macros_file_name + " could not be opened. Cause: " +
417 str(e))
418
419 # we depend on "stdio", "requirements" and "advanced_options" to exist in at least one of the given macros files
420 missing_needed_macros = []
421 for required_macro in REQUIRED_MACROS:
422 if required_macro not in macros_to_expand:
423 missing_needed_macros.append(required_macro)
424
425 if missing_needed_macros:
426 raise ApplicationException(
427 "The following required macro(s) were not found in any of the given macros files: %s, "
428 "see sample_files/macros.xml for an example of a valid macros file."
429 % ", ".join(missing_needed_macros))
430
431 # we do not need to "expand" the advanced_options macro
432 macros_to_expand.remove(ADVANCED_OPTIONS_MACRO_NAME)
433 return macros_to_expand
434
435
436 def parse_hardcoded_parameters(hardcoded_parameters_file):
437 parameter_hardcoder = ParameterHardcoder()
438 if hardcoded_parameters_file is not None:
439 line_number = 0
440 with open(hardcoded_parameters_file) as f:
441 for line in f:
442 line_number += 1
443 if line is None or not line.strip() or line.strip().startswith("#"):
444 pass
445 else:
446 # the third column must be obtained as a whole, and not split
447 parsed_hardcoded_parameter = line.strip().split(None, 2)
448 # valid lines contain two or three columns
449 if len(parsed_hardcoded_parameter) != 2 and len(parsed_hardcoded_parameter) != 3:
450 warning("Invalid line at line number %d of the given hardcoded parameters file. Line will be "
451 "ignored:\n%s" % (line_number, line), 0)
452 continue
453
454 parameter_name = parsed_hardcoded_parameter[0]
455 hardcoded_value = parsed_hardcoded_parameter[1]
456 tool_names = None
457 if len(parsed_hardcoded_parameter) == 3:
458 tool_names = parsed_hardcoded_parameter[2].split(',')
459 if tool_names:
460 for tool_name in tool_names:
461 parameter_hardcoder.register_parameter(parameter_name, hardcoded_value, tool_name.strip())
462 else:
463 parameter_hardcoder.register_parameter(parameter_name, hardcoded_value)
464
465 return parameter_hardcoder
466
467
468 def parse_file_formats(formats_file):
469 supported_formats = {}
470 if formats_file is not None:
471 line_number = 0
472 with open(formats_file) as f:
473 for line in f:
474 line_number += 1
475 if line is None or not line.strip() or line.strip().startswith("#"):
476 # ignore (it'd be weird to have something like:
477 # if line is not None and not (not line.strip()) ...
478 pass
479 else:
480 # not an empty line, no comment
481 # strip the line and split by whitespace
482 parsed_formats = line.strip().split()
483 # valid lines contain either one, three or four columns
484 if not (len(parsed_formats) == 1 or len(parsed_formats) == 3 or len(parsed_formats) == 4):
485 warning("Invalid line at line number %d of the given formats file. Line will be ignored:\n%s" %
486 (line_number, line), 0)
487 # ignore the line
488 continue
489 elif len(parsed_formats) == 1:
490 supported_formats[parsed_formats[0]] = DataType(parsed_formats[0], parsed_formats[0])
491 else:
492 mimetype = None
493 # check if mimetype was provided
494 if len(parsed_formats) == 4:
495 mimetype = parsed_formats[3]
496 supported_formats[parsed_formats[0]] = DataType(parsed_formats[0], parsed_formats[1],
497 parsed_formats[2], mimetype)
498 return supported_formats
499
500
501 def validate_and_prepare_args(args):
502 # check that only one of skip_tools_file and required_tools_file has been provided
503 if args.skip_tools_file is not None and args.required_tools_file is not None:
504 raise ApplicationException(
505 "You have provided both a file with tools to ignore and a file with required tools.\n"
506 "Only one of -s/--skip-tools, -r/--required-tools can be provided.")
507
508 # first, we convert all list of lists in args to flat lists
509 lists_to_flatten = ["input_files", "blacklisted_parameters", "macros_files"]
510 for list_to_flatten in lists_to_flatten:
511 setattr(args, list_to_flatten, [item for sub_list in getattr(args, list_to_flatten) for item in sub_list])
512
513 # if input is a single file, we expect output to be a file (and not a dir that already exists)
514 if len(args.input_files) == 1:
515 if os.path.isdir(args.output_destination):
516 raise ApplicationException("If a single input file is provided, output (%s) is expected to be a file "
517 "and not a folder.\n" % args.output_destination)
518
519 # if input is a list of files, we expect output to be a folder
520 if len(args.input_files) > 1:
521 if not os.path.isdir(args.output_destination):
522 raise ApplicationException("If several input files are provided, output (%s) is expected to be an "
523 "existing directory.\n" % args.output_destination)
524
525 # check that the provided input files, if provided, contain a valid file path
526 input_variables_to_check = ["skip_tools_file", "required_tools_file", "macros_files", "xsd_location",
527 "input_files", "formats_file", "hardcoded_parameters"]
528
529 for variable_name in input_variables_to_check:
530 paths_to_check = []
531 # check if we are handling a single file or a list of files
532 member_value = getattr(args, variable_name)
533 if member_value is not None:
534 if isinstance(member_value, list):
535 for file_name in member_value:
536 paths_to_check.append(strip(str(file_name)))
537 else:
538 paths_to_check.append(strip(str(member_value)))
539
540 for path_to_check in paths_to_check:
541 if not os.path.isfile(path_to_check):
542 raise ApplicationException(
543 "The provided input file (%s) does not exist or is not a valid file path."
544 % path_to_check)
545
546 # check that the provided output files, if provided, contain a valid file path (i.e., not a folder)
547 output_variables_to_check = ["data_types_destination", "tool_conf_destination"]
548
549 for variable_name in output_variables_to_check:
550 file_name = getattr(args, variable_name)
551 if file_name is not None and os.path.isdir(file_name):
552 raise ApplicationException("The provided output file name (%s) points to a directory." % file_name)
553
554 if not args.macros_files:
555 # list is empty, provide the default value
556 warning("Using default macros from macros.xml", 0)
557 args.macros_files = ["macros.xml"]
558
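The flattening step in `validate_and_prepare_args` suggests that some options arrive as lists of lists. One way this can happen (an assumption here, since the parser setup is not shown) is an `argparse` option declared with both `action="append"` and `nargs="+"`. The flag name `-i/--input` below is hypothetical:

```python
import argparse

# Hypothetical parser: the real flag names and destinations may differ.
parser = argparse.ArgumentParser()
parser.add_argument("-i", "--input", dest="input_files",
                    action="append", nargs="+", default=[])

# Each occurrence of -i contributes one inner list.
args = parser.parse_args(["-i", "a.ctd", "b.ctd", "-i", "c.ctd"])
nested = args.input_files

# Same comprehension as the one used in validate_and_prepare_args.
flat = [item for sub_list in nested for item in sub_list]
print(flat)
```

After flattening, the single-file versus multi-file checks can rely on `len(args.input_files)`.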
559
560 def convert(input_files, output_destination, **kwargs):
561 # first, generate a model
562 is_converting_multiple_ctds = len(input_files) > 1
563 parsed_models = []
564 schema = None
565 if kwargs["xsd_location"] is not None:
566 try:
567 info("Loading validation schema from %s" % kwargs["xsd_location"], 0)
568 schema = etree.XMLSchema(etree.parse(kwargs["xsd_location"]))
569 except Exception as e:
570 error("Could not load validation schema %s. Reason: %s" % (kwargs["xsd_location"], str(e)), 0)
571 else:
572 info("Validation against a schema has not been enabled.", 0)
573 for input_file in input_files:
574 try:
575 if schema is not None:
576 validate_against_schema(input_file, schema)
577 model = CTDModel(from_file=input_file)
578 except Exception as e:
579 error(str(e), 1)
580 continue
581
582 if kwargs["skip_tools"] is not None and model.name in kwargs["skip_tools"]:
583 info("Skipping tool %s" % model.name, 0)
584 continue
585 elif kwargs["required_tools"] is not None and model.name not in kwargs["required_tools"]:
586 info("Tool %s is not required, skipping it" % model.name, 0)
587 continue
588 else:
589 info("Converting from %s " % input_file, 0)
590 tool = create_tool(model)
591 write_header(tool, model)
592 create_description(tool, model)
593 expand_macros(tool, model, **kwargs)
594 create_command(tool, model, **kwargs)
595 create_inputs(tool, model, **kwargs)
596 create_outputs(tool, model, **kwargs)
597 create_help(tool, model)
598
599 # finally, serialize the tool
600 output_file = output_destination
601 # if multiple inputs are being converted,
602 # then we need to generate a different output_file for each input
603 if is_converting_multiple_ctds:
604 output_file = os.path.join(output_file, get_filename_without_suffix(input_file) + ".xml")
605 # wrap our tool element into a tree to be able to serialize it
606 tree = ElementTree(tool)
607 tree.write(output_file, encoding="UTF-8", xml_declaration=True, pretty_print=True)
608 # let's use model to hold the name of the output file
609 parsed_models.append([model, get_filename(output_file)])
610
611 return parsed_models
612
613
614 # validates a ctd file against the schema
615 def validate_against_schema(ctd_file, schema):
616 try:
617 parser = etree.XMLParser(schema=schema)
618 etree.parse(ctd_file, parser=parser)
619 except etree.XMLSyntaxError as e:
620 raise ApplicationException("Input ctd file %s is not valid. Reason: %s" % (ctd_file, str(e)))
621
622
623 def write_header(tool, model):
624 tool.addprevious(etree.Comment(
625 "This is a configuration file for the integration of a tool into Galaxy (https://galaxyproject.org/). "
626 "This file was automatically generated using CTD2Galaxy."))
627 tool.addprevious(etree.Comment('Proposed Tool Section: [%s]' % model.opt_attribs.get("category", "")))
628
629
630 def generate_tool_conf(parsed_models, tool_conf_destination, galaxy_tool_path, default_category):
631 # for each category, we keep a list of models corresponding to it
632 categories_to_tools = dict()
633 for model in parsed_models:
634 category = strip(model[0].opt_attribs.get("category", ""))
635 if not category.strip():
636 category = default_category
637 if category not in categories_to_tools:
638 categories_to_tools[category] = []
639 categories_to_tools[category].append(model[1])
640
641 # at this point, we should have a map for all categories->tools
642 toolbox_node = Element("toolbox")
643
644 if galaxy_tool_path is not None and not galaxy_tool_path.strip().endswith("/"):
645 galaxy_tool_path = galaxy_tool_path.strip() + "/"
646 if galaxy_tool_path is None:
647 galaxy_tool_path = ""
648
649 for category, file_names in categories_to_tools.iteritems():
650 section_node = add_child_node(toolbox_node, "section")
651 section_node.attrib["id"] = "section-id-" + "".join(category.split())
652 section_node.attrib["name"] = category
653
654 for filename in file_names:
655 tool_node = add_child_node(section_node, "tool")
656 tool_node.attrib["file"] = galaxy_tool_path + filename
657
658 toolconf_tree = ElementTree(toolbox_node)
659 toolconf_tree.write(tool_conf_destination, encoding="UTF-8", xml_declaration=True, pretty_print=True)
660 info("Generated Galaxy tool_conf.xml in %s" % tool_conf_destination, 0)
661
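The grouping performed by `generate_tool_conf` can be sketched without lxml. The snippet below is a simplified stand-in that uses the standard library's `xml.etree.ElementTree`; the function name `build_toolbox` and the `(file_name, category)` tuples are illustrative assumptions, not part of the converter:

```python
from collections import OrderedDict
import xml.etree.ElementTree as ET


def build_toolbox(tools, default_category="Default"):
    """Group (file_name, category) pairs into <section> nodes (illustrative only)."""
    by_category = OrderedDict()
    for file_name, category in tools:
        # Empty or whitespace-only categories fall back to the default.
        category = (category or "").strip() or default_category
        by_category.setdefault(category, []).append(file_name)

    toolbox = ET.Element("toolbox")
    for category, file_names in by_category.items():
        section = ET.SubElement(toolbox, "section", {
            # The id drops all whitespace from the category name.
            "id": "section-id-" + "".join(category.split()),
            "name": category,
        })
        for file_name in file_names:
            ET.SubElement(section, "tool", {"file": file_name})
    return toolbox


toolbox = build_toolbox([("a.xml", "Peak Picking"), ("b.xml", ""), ("c.xml", "Peak Picking")])
sections = toolbox.findall("section")
print([s.get("id") for s in sections])
```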
662
663 def generate_data_type_conf(supported_file_formats, data_types_destination):
664 data_types_node = Element("datatypes")
665 registration_node = add_child_node(data_types_node, "registration")
666 registration_node.attrib["converters_path"] = "lib/galaxy/datatypes/converters"
667 registration_node.attrib["display_path"] = "display_applications"
668
669 for format_name in supported_file_formats:
670 data_type = supported_file_formats[format_name]
671 # add only if it's a data type that does not exist in Galaxy
672 if data_type.galaxy_type is not None:
673 data_type_node = add_child_node(registration_node, "datatype")
674 # we know galaxy_extension is not None
675 data_type_node.attrib["extension"] = data_type.galaxy_extension
676 data_type_node.attrib["type"] = data_type.galaxy_type
677 if data_type.mimetype is not None:
678 data_type_node.attrib["mimetype"] = data_type.mimetype
679
680 data_types_tree = ElementTree(data_types_node)
681 data_types_tree.write(data_types_destination, encoding="UTF-8", xml_declaration=True, pretty_print=True)
682 info("Generated Galaxy datatypes_conf.xml in %s" % data_types_destination, 0)
683
684
685 # taken from
686 # http://stackoverflow.com/questions/8384737/python-extract-file-name-from-path-no-matter-what-the-os-path-format
687 def get_filename(path):
688 head, tail = ntpath.split(path)
689 return tail or ntpath.basename(head)
690
691
692 def get_filename_without_suffix(path):
693 root, ext = os.path.splitext(os.path.basename(path))
694 return root
695
696
697 def create_tool(model):
698 return Element("tool", OrderedDict([("id", model.name), ("name", model.name), ("version", model.version)]))
699
700
701 def create_description(tool, model):
702 if "description" in model.opt_attribs.keys() and model.opt_attribs["description"] is not None:
703 description = SubElement(tool,"description")
704 description.text = model.opt_attribs["description"]
705
706
707 def get_param_cli_name(param, model):
708 # we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy)
709 if type(param.parent) == ParameterGroup:
710 if not hasattr(param.parent.parent, 'parent'):
711 return resolve_param_mapping(param, model)
712 elif not hasattr(param.parent.parent.parent, 'parent'):
713 return resolve_param_mapping(param, model)
714 else:
715 if model.cli:
716 warning("Using nested parameter sections (NODE elements) is not compatible with <cli>", 1)
717 return get_param_name(param.parent) + ":" + resolve_param_mapping(param, model)
718 else:
719 return resolve_param_mapping(param, model)
720
721
722 def get_param_name(param):
723 # we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy)
724 if type(param.parent) == ParameterGroup:
725 if not hasattr(param.parent.parent, 'parent'):
726 return param.name
727 elif not hasattr(param.parent.parent.parent, 'parent'):
728 return param.name
729 else:
730 return get_param_name(param.parent) + ":" + param.name
731 else:
732 return param.name
733
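The naming rule in `get_param_name` (colon-separated names for nested groups, except under the two topmost parents) can be illustrated with a minimal tree. The `Node` class is a hypothetical stand-in for the CTDopts parameter classes, and the sketch assumes the root node's `parent` is `None`:

```python
class Node(object):
    """Hypothetical stand-in for a parameter or parameter group."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def qualified_name(node):
    parent = node.parent
    # Nodes whose parent or grandparent is the root keep their plain name.
    if parent is None or parent.parent is None or parent.parent.parent is None:
        return node.name
    # Deeper nodes are prefixed with the qualified name of their group.
    return qualified_name(parent) + ":" + node.name


root = Node("root")
top = Node("1", root)
algorithm = Node("algorithm", top)
in_param = Node("in", top)
threshold = Node("threshold", algorithm)

print(qualified_name(in_param))
print(qualified_name(threshold))
```

Here `in` stays unqualified because its group sits directly under the root, while `threshold` picks up the `algorithm:` prefix.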
734
735 # some parameters are mapped to command line options, this method helps resolve those mappings, if any
736 def resolve_param_mapping(param, model):
737 # go through all mappings and find if the given param appears as a reference name in a mapping element
738 param_mapping = None
739 for cli_element in model.cli:
740 for mapping_element in cli_element.mappings:
741 if mapping_element.reference_name == param.name:
742 if param_mapping is not None:
743 warning("The parameter %s has more than one mapping in the <cli> section. "
744 "The first found mapping, %s, will be used." % (param.name, param_mapping), 1)
745 else:
746 param_mapping = cli_element.option_identifier
747
748 return param_mapping if param_mapping is not None else param.name
749
750 def create_command(tool, model, **kwargs):
751 final_command = get_tool_executable_path(model, kwargs["default_executable_path"]) + '\n'
752 final_command += kwargs["add_to_command_line"] + '\n'
753 advanced_command_start = "#if $adv_opts.adv_opts_selector=='advanced':\n"
754 advanced_command_end = '#end if'
755 advanced_command = ''
756 parameter_hardcoder = kwargs["parameter_hardcoder"]
757
758 found_output_parameter = False
759 for param in extract_parameters(model):
760 if param.type is _OutFile:
761 found_output_parameter = True
762 command = ''
763 param_name = get_param_name(param)
764 param_cli_name = get_param_cli_name(param, model)
765 if param_name == param_cli_name:
766 # there was no mapping, so for the cli name we will use a '-' in the prefix
767 param_cli_name = '-' + param_name
768
769 if param.name in kwargs["blacklisted_parameters"]:
770 continue
771
772 hardcoded_value = parameter_hardcoder.get_hardcoded_value(param_name, model.name)
773 if hardcoded_value:
774 command += '%s %s\n' % (param_cli_name, hardcoded_value)
775 else:
776 # parameter is neither blacklisted nor hardcoded...
777 galaxy_parameter_name = get_galaxy_parameter_name(param)
778 repeat_galaxy_parameter_name = get_repeat_galaxy_parameter_name(param)
779
780 # logic for ITEMLISTs
781 if param.is_list:
782 if param.type is _InFile:
783 command += param_cli_name + "\n"
784 command += " #for token in $" + galaxy_parameter_name + ":\n"
785 command += " $token\n"
786 command += " #end for\n"
787 else:
788 command += "\n#if $" + repeat_galaxy_parameter_name + ":\n"
789 command += param_cli_name + "\n"
790 command += " #for token in $" + repeat_galaxy_parameter_name + ":\n"
791 command += " #if \" \" in str(token):\n"
792 command += " \"$token." + galaxy_parameter_name + "\"\n"
793 command += " #else\n"
794 command += " $token." + galaxy_parameter_name + "\n"
795 command += " #end if\n"
796 command += " #end for\n"
797 command += "#end if\n"
798 # logic for other ITEMs
799 else:
800 if param.advanced and param.type is not _OutFile:
801 actual_parameter = "$adv_opts.%s" % galaxy_parameter_name
802 else:
803 actual_parameter = "$%s" % galaxy_parameter_name
804 ## if whitespace_validation has been set, we need to generate, for each parameter:
805 ## #if str( $t ).split() != '':
806 ## -t "$t"
807 ## #end if
808 ## TODO only useful for text fields, integers or floats
809 ## not useful for choices, input fields ...
810
811 if not is_boolean_parameter(param) and type(param.restrictions) is _Choices:
812 command += "#if " + actual_parameter + ":\n"
813 command += ' %s\n' % param_cli_name
814 command += " #if \" \" in str(" + actual_parameter + "):\n"
815 command += " \"" + actual_parameter + "\"\n"
816 command += " #else\n"
817 command += " " + actual_parameter + "\n"
818 command += " #end if\n"
819 command += "#end if\n"
820 elif is_boolean_parameter(param):
821 command += "#if " + actual_parameter + ":\n"
822 command += ' %s\n' % param_cli_name
823 command += "#end if\n"
824 elif TYPE_TO_GALAXY_TYPE[param.type] == 'text':
825 command += "#if " + actual_parameter + ":\n"
826 command += " %s " % param_cli_name
827 command += " \"" + actual_parameter + "\"\n"
828 command += "#end if\n"
829 else:
830 command += "#if " + actual_parameter + ":\n"
831 command += ' %s ' % param_cli_name
832 command += actual_parameter + "\n"
833 command += "#end if\n"
834
835 if param.advanced and param.type is not _OutFile:
836 advanced_command += " %s" % command
837 else:
838 final_command += command
839
840 if advanced_command:
841 final_command += "%s%s%s\n" % (advanced_command_start, advanced_command, advanced_command_end)
842
843 if not found_output_parameter:
844 final_command += "> $param_stdout\n"
845
846 command_node = add_child_node(tool, "command")
847 command_node.text = final_command
848
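For plain (non-list, non-choice) parameters, `create_command` wraps each option in a Cheetah `#if` block so that the option is emitted only when the Galaxy parameter has a value. A reduced sketch of that pattern follows; the helper name `cheetah_optional` is hypothetical and the whitespace differs slightly from what the converter generates:

```python
def cheetah_optional(cli_name, galaxy_name, quote=False):
    """Return a Cheetah snippet that emits `cli_name value` only if the value is set."""
    value = "$%s" % galaxy_name
    if quote:
        # Text parameters are quoted so that embedded spaces survive.
        value = '"%s"' % value
    return "#if $%s:\n  %s %s\n#end if\n" % (galaxy_name, cli_name, value)


snippet = cheetah_optional("-threads", "param_threads")
quoted = cheetah_optional("-label", "param_label", quote=True)
print(snippet)
print(quoted)
```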
849
850 # creates the xml elements needed to import the needed macros files
851 # and to "expand" the macros
852 def expand_macros(tool, model, **kwargs):
853 macros_node = add_child_node(tool, "macros")
854 token_node = add_child_node(macros_node, "token")
855 token_node.attrib["name"] = "@EXECUTABLE@"
856 token_node.text = get_tool_executable_path(model, kwargs["default_executable_path"])
857
858 # add <import> nodes
859 for macro_file_name in kwargs["macros_file_names"]:
860 macro_file = open(macro_file_name)
861 import_node = add_child_node(macros_node, "import")
862 # do not add the path of the file, rather, just its basename
863 import_node.text = os.path.basename(macro_file.name)
864
865 # add <expand> nodes
866 for expand_macro in kwargs["macros_to_expand"]:
867 expand_node = add_child_node(tool, "expand")
868 expand_node.attrib["macro"] = expand_macro
869
870
871 def get_tool_executable_path(model, default_executable_path):
872 # rules to build the galaxy executable path:
873 # if executablePath is null, then use default_executable_path and store it in executablePath
874 # if executablePath is null and executableName is null, then the name of the tool will be used
875 # if executablePath is null and executableName is not null, then executableName will be used
876 # if executablePath is not null and executableName is null,
877 # then executablePath and the name of the tool will be used
878 # if executablePath is not null and executableName is not null, then both will be used
879
880 # first, check if the model has executablePath / executableName defined
881 executable_path = model.opt_attribs.get("executablePath", None)
882 executable_name = model.opt_attribs.get("executableName", None)
883
884 # check if we need to use the default_executable_path
885 if executable_path is None:
886 executable_path = default_executable_path
887
888 # fix the executablePath to make sure that there is a '/' in the end
889 if executable_path is not None:
890 executable_path = executable_path.strip()
891 if not executable_path.endswith('/'):
892 executable_path += '/'
893
894 # assume that we have all information present
895 command = str(executable_path) + str(executable_name)
896 if executable_path is None:
897 if executable_name is None:
898 command = model.name
899 else:
900 command = executable_name
901 else:
902 if executable_name is None:
903 command = executable_path + model.name
904 return command
905
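The rules listed in `get_tool_executable_path` reduce to two fallbacks followed by a join. The sketch below restates them with plain arguments instead of a CTD model; the function name and signature are illustrative assumptions:

```python
def resolve_executable(tool_name, executable_path=None, executable_name=None, default_path=None):
    """Illustrative restatement of the executable path rules."""
    # Fall back to the default path and to the tool name, respectively.
    path = executable_path if executable_path is not None else default_path
    name = executable_name if executable_name is not None else tool_name
    if path is None:
        return name
    # Make sure the path ends with exactly one separator before joining.
    path = path.strip()
    if not path.endswith("/"):
        path += "/"
    return path + name


print(resolve_executable("FileInfo"))
print(resolve_executable("FileInfo", default_path="/opt/bin"))
print(resolve_executable("FileInfo", executable_path=" /usr/bin/ ", executable_name="fi"))
```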
906
907 def get_galaxy_parameter_name(param):
908 return "param_%s" % get_param_name(param).replace(':', '_').replace('-', '_')
909
910
911 def get_input_with_same_restrictions(out_param, model, supported_file_formats):
912 for param in extract_parameters(model):
913 if param.type is _InFile:
914 if param.restrictions is not None:
915 in_param_formats = get_supported_file_types(param.restrictions.formats, supported_file_formats)
916 out_param_formats = get_supported_file_types(out_param.restrictions.formats, supported_file_formats)
917 if in_param_formats == out_param_formats:
918 return param
919
920
921 def create_inputs(tool, model, **kwargs):
922 inputs_node = SubElement(tool, "inputs")
923
924 # some suites (such as OpenMS) need some advanced options when handling inputs
925 expand_advanced_node = add_child_node(tool, "expand", OrderedDict([("macro", ADVANCED_OPTIONS_MACRO_NAME)]))
926 parameter_hardcoder = kwargs["parameter_hardcoder"]
927
928 # treat all non output-file parameters as inputs
929 for param in extract_parameters(model):
930 # no need to show hardcoded parameters
931 hardcoded_value = parameter_hardcoder.get_hardcoded_value(param.name, model.name)
932 if param.name in kwargs["blacklisted_parameters"] or hardcoded_value:
933 # let's not use an extra level of indentation and use NOP
934 continue
935 if param.type is not _OutFile:
936 if param.advanced:
937 if expand_advanced_node is not None:
938 parent_node = expand_advanced_node
939 else:
940 # something went wrong... we are handling an advanced parameter and the
941 # advanced input macro was not set... inform the user about it
942 info("The parameter %s has been set as advanced, but advanced_input_macro has "
943 "not been set." % param.name, 1)
944 # there is not much we can do, other than use the inputs_node as a parent node!
945 parent_node = inputs_node
946 else:
947 parent_node = inputs_node
948
949 # for lists we need a repeat tag
950 if param.is_list and param.type is not _InFile:
951 rep_node = add_child_node(parent_node, "repeat")
952 create_repeat_attribute_list(rep_node, param)
953 parent_node = rep_node
954
955 param_node = add_child_node(parent_node, "param")
956 create_param_attribute_list(param_node, param, kwargs["supported_file_formats"])
957
958 # advanced parameter selection should be at the end
959 # and only available if an advanced parameter exists
960 if expand_advanced_node is not None and len(expand_advanced_node) > 0:
961 inputs_node.append(expand_advanced_node)
962
963
964 def get_repeat_galaxy_parameter_name(param):
965 return "rep_" + get_galaxy_parameter_name(param)
966
967
968 def create_repeat_attribute_list(rep_node, param):
969 rep_node.attrib["name"] = get_repeat_galaxy_parameter_name(param)
970 if param.required:
971 rep_node.attrib["min"] = "1"
972 else:
973 rep_node.attrib["min"] = "0"
974 # for the ITEMLISTs which have LISTITEM children we only
975 # need one parameter as it is given as a string
976 if param.default is not None:
977 rep_node.attrib["max"] = "1"
978 rep_node.attrib["title"] = get_galaxy_parameter_name(param)
979
980
981 def create_param_attribute_list(param_node, param, supported_file_formats):
982 param_node.attrib["name"] = get_galaxy_parameter_name(param)
983
984 param_type = TYPE_TO_GALAXY_TYPE[param.type]
985 if param_type is None:
986 raise ModelError("Unrecognized parameter type %(type)s for parameter %(name)s"
987 % {"type": param.type, "name": param.name})
988
989 if param.is_list:
990 param_type = "text"
991
992 if is_selection_parameter(param):
993 param_type = "select"
994 if len(param.restrictions.choices) < 5:
995 param_node.attrib["display"] = "radio"
996
997 if is_boolean_parameter(param):
998 param_type = "boolean"
999
1000 if param.type is _InFile:
1001 # assume it's just text unless restrictions are provided
1002 param_format = "text"
1003 if param.restrictions is not None:
1004 # join all formats of the file, take mapping from supported_file if available for an entry
1005 if type(param.restrictions) is _FileFormat:
1006 param_format = ','.join([get_supported_file_type(i, supported_file_formats) if
1007 get_supported_file_type(i, supported_file_formats)
1008 else i for i in param.restrictions.formats])
1009 else:
1010 raise InvalidModelException("Expected 'file type' restrictions for input file [%(name)s], "
1011 "but instead got [%(type)s]"
1012 % {"name": param.name, "type": type(param.restrictions)})
1013
1014 param_node.attrib["type"] = "data"
1015 param_node.attrib["format"] = param_format
1016 # in the case of multiple input set multiple flag
1017 if param.is_list:
1018 param_node.attrib["multiple"] = "true"
1019
1020 else:
1021 param_node.attrib["type"] = param_type
1022
1023 # check for parameters with restricted values (which will correspond to a "select" in galaxy)
1024 if param.restrictions is not None:
1025 # it could be either _Choices or _NumericRange, with special case for boolean types
1026 if param_type == "boolean":
1027 create_boolean_parameter(param_node, param)
1028 elif type(param.restrictions) is _Choices:
1029 # create as many <option> elements as restriction values
1030 for choice in param.restrictions.choices:
1031 option_node = add_child_node(param_node, "option", OrderedDict([("value", str(choice))]))
1032 option_node.text = str(choice)
1033
1034 # preselect the default value
1035 if param.default == choice:
1036 option_node.attrib["selected"] = "true"
1037
1038 elif type(param.restrictions) is _NumericRange:
1039 if param.type is not int and param.type is not float:
1040 raise InvalidModelException("Expected either 'int' or 'float' in the numeric range restriction for "
1041 "parameter [%(name)s], but instead got [%(type)s]" %
1042 {"name": param.name, "type": param.type})
1043 # extract the min and max values and add them as attributes
1044 # validate the provided min and max values
1045 if param.restrictions.n_min is not None:
1046 param_node.attrib["min"] = str(param.restrictions.n_min)
1047 if param.restrictions.n_max is not None:
1048 param_node.attrib["max"] = str(param.restrictions.n_max)
1049 elif type(param.restrictions) is _FileFormat:
1050 param_node.attrib["format"] = ','.join([get_supported_file_type(i, supported_file_formats) if
1051 get_supported_file_type(i, supported_file_formats)
1052 else i for i in param.restrictions.formats])
1053 else:
1054 raise InvalidModelException("Unrecognized restriction type [%(type)s] for parameter [%(name)s]"
1055 % {"type": type(param.restrictions), "name": param.name})
1056
1057 if param_type == "select" and param.default in param.restrictions.choices:
1058 param_node.attrib["optional"] = "False"
1059 else:
1060 param_node.attrib["optional"] = str(not param.required)
1061
1062 if param_type == "text":
1063 # add size attribute... this is the length of a textbox field in Galaxy (it could also be 15x2, for instance)
1064 param_node.attrib["size"] = "30"
1065 # add sanitizer nodes, this is needed for special character like "["
1066 # which are used for example by FeatureFinderMultiplex
1067 sanitizer_node = SubElement(param_node, "sanitizer")
1068
1069 valid_node = SubElement(sanitizer_node, "valid", OrderedDict([("initial", "string.printable")]))
1070 add_child_node(valid_node, "remove", OrderedDict([("value", '\'')]))
1071 add_child_node(valid_node, "remove", OrderedDict([("value", '"')]))
1072
1073 # check for default value
1074 if param.default is not None and param.default is not _Null:
1075 if type(param.default) is list:
1076 # we ASSUME that a list of parameters looks like:
1077 # $ tool -ignore He Ar Xe
1078 # meaning, that, for example, Helium, Argon and Xenon will be ignored
1079 param_node.attrib["value"] = ' '.join(map(str, param.default))
1080
1081 elif param_type != "boolean":
1082 param_node.attrib["value"] = str(param.default)
1083
1084 else:
1085 # simple boolean with a default
1086 if param.default is True:
1087 param_node.attrib["checked"] = "true"
1088 else:
1089 if param.type is int or param.type is float:
1090 # galaxy requires "value" to be included for int/float
1091 # since no default was included, we need to figure out one in a clever way... but let the user know
1092 # that we are "thinking" for him/her
1093 warning("Generating default value for parameter [%s]. "
1094 "Galaxy requires the attribute 'value' to be set for integer/floats. "
1095 "Edit the CTD file and provide a suitable default value." % param.name, 1)
1096 # check if there's a min/max and try to use them
1097 default_value = None
1098 if param.restrictions is not None:
1099 if type(param.restrictions) is _NumericRange:
1100 default_value = param.restrictions.n_min
1101 if default_value is None:
1102 default_value = param.restrictions.n_max
1103 if default_value is None:
1104 # no min/max provided... just use 0 and see what happens
1105 default_value = 0
1106 else:
1107 # should never be here, since we have validated this anyway...
1108 # this code is here just for documentation purposes
1109 # however, better safe than sorry!
1110 # (it could be that the code changes and then we have an ugly scenario)
1111 raise InvalidModelException("Expected a numeric range for parameter [%(name)s], "
1112 "but instead got [%(type)s]"
1113 % {"name": param.name, "type": type(param.restrictions)})
1114 else:
1115 # no restrictions and no default value provided...
1116 # make up something
1117 default_value = 0
1118 param_node.attrib["value"] = str(default_value)
1119
1120 label = "%s parameter" % param.name
1121 help_text = ""
1122
1123 if param.description is not None:
1124 label, help_text = generate_label_and_help(param.description)
1125
1126 param_node.attrib["label"] = label
1127 param_node.attrib["help"] = "(-%s)" % param.name + " " + help_text
1128
1129
1130 def generate_label_and_help(desc):
1131 label = ""
1132 help_text = ""
1133 # This tag is found in some descriptions
1134 if not isinstance(desc, basestring):
1135 desc = str(desc)
1136 desc = desc.encode("utf8").replace("#br#", " <br>")
1137 # Get rid of dots in the end
1138 if desc.endswith("."):
1139 desc = desc.rstrip(".")
1140 # Check if first word is a normal word and make it uppercase
1141 if str(desc).find(" ") > -1:
1142 first_word, rest = str(desc).split(" ", 1)
1143 if str(first_word).islower():
1144 # check if label has a quotient of the form a/b
1145 if first_word.find("/") != 1:
1146 first_word = first_word.capitalize()
1147 desc = first_word + " " + rest
1148 label = desc.decode("utf8")
1149
1150 # Try to split the label if it is too long
1151 if len(desc) > 50:
1152 # find an example and put everything before in the label and the e.g. in the help
1153 if desc.find("e.g.") > 1 :
1154 label, help_text = desc.split("e.g.",1)
1155 help_text = "e.g." + help_text
1156 else:
1157 # find the end of the first sentence
1158 # look for ". " because some labels contain .file or something similar
1159 delimiter = ""
1160 if desc.find(". ") > 1 and desc.find("? ") > 1:
1161 if desc.find(". ") < desc.find("? "):
1162 delimiter = ". "
1163 else:
1164 delimiter = "? "
1165 elif desc.find(". ") > 1:
1166 delimiter = ". "
1167 elif desc.find("? ") > 1:
1168 delimiter = "? "
1169 if delimiter != "":
1170 label, help_text = desc.split(delimiter, 1)
1171
1172 # add the question mark back
1173 if delimiter == "? ":
1174 label += "? "
1175
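1176 # remove a trailing line break; str.rstrip('<br>') would strip any trailing '<', 'b', 'r' or '>' characters
1177 label = label.rstrip()[:-len('<br>')].rstrip() if label.rstrip().endswith('<br>') else label.rstrip()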
1178 return label, help_text
1179
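The sentence-splitting branch of `generate_label_and_help` can be condensed as follows. This is a simplified sketch that leaves out the `e.g.` handling, the capitalization step, and the `<br>` cleanup; the name `split_label` and the `max_len` parameter are assumptions made for illustration:

```python
def split_label(desc, max_len=50):
    """Split a long description into (label, help_text) at the first sentence end."""
    if len(desc) <= max_len:
        return desc, ""
    # Collect the positions of the sentence delimiters that occur in the text.
    hits = [(desc.find(d), d) for d in (". ", "? ") if desc.find(d) > 1]
    if not hits:
        return desc, ""
    position, delimiter = min(hits)
    label = desc[:position]
    help_text = desc[position + len(delimiter):]
    if delimiter == "? ":
        # Put the question mark back on the label.
        label += "?"
    return label, help_text


label, help_text = split_label(
    "Maximum number of peaks to keep per spectrum. Set to 0 to keep all peaks")
print(label)
print(help_text)
```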
1180
1181 def get_indented_text(text, indentation_level):
1182 return ("%(indentation)s%(text)s" %
1183 {"indentation": " " * (MESSAGE_INDENTATION_INCREMENT * indentation_level),
1184 "text": text})
1185
1186
1187 def warning(warning_text, indentation_level):
1188 sys.stdout.write(get_indented_text("WARNING: %s\n" % warning_text, indentation_level))
1189
1190
1191 def error(error_text, indentation_level):
1192 sys.stderr.write(get_indented_text("ERROR: %s\n" % error_text, indentation_level))
1193
1194
1195 def info(info_text, indentation_level):
1196 sys.stdout.write(get_indented_text("INFO: %s\n" % info_text, indentation_level))
1197
1198
1199 # determines if the given choices are boolean (basically, if the possible values are yes/no, true/false)
1200 def is_boolean_parameter(param):
1201 ## detect boolean selects of OpenMS
1202 if is_selection_parameter(param):
1203 if len(param.restrictions.choices) == 2:
1204 # check that default value is false to make sure it is an actual flag
1205 if "false" in param.restrictions.choices and \
1206 "true" in param.restrictions.choices and \
1207 param.default == "false":
1208 return True
1209 else:
1210 return param.type is bool
1211
1212
1213 # determines if there are choices for the parameter
1214 def is_selection_parameter(param):
1215 return type(param.restrictions) is _Choices
1216
1217
1218 def get_lowercase_list(some_list):
1219 lowercase_list = map(str, some_list)
1220 lowercase_list = map(string.lower, lowercase_list)
1221 lowercase_list = map(strip, lowercase_list)
1222 return lowercase_list
1223
1224
1225 # creates a galaxy boolean parameter type
1226 # this method assumes that param has restrictions, and that only two restrictions are present
1227 # (either yes/no or true/false)
1228 def create_boolean_parameter(param_node, param):
1229 # first, determine the 'truevalue' and the 'falsevalue'
1230 """TODO: true and false values can be way more than 'true' and 'false'
1231 but for that we need CTD support
1232 """
1233 # by default, 'true' and 'false' are handled as flags, like the verbose flag (i.e., -v)
1234 true_value = "-%s" % get_param_name(param)
1235 false_value = ""
1236 choices = get_lowercase_list(param.restrictions.choices)
1237 if "yes" in choices:
1238 true_value = "yes"
1239 false_value = "no"
1240 param_node.attrib["truevalue"] = true_value
1241 param_node.attrib["falsevalue"] = false_value
1242
1243 # set the checked attribute
1244 if param.default is not None:
1245 checked_value = "false"
1246 default = strip(string.lower(param.default))
1247 if default == "yes" or default == "true":
1248 checked_value = "true"
1249 #attribute_list["checked"] = checked_value
1250 param_node.attrib["checked"] = checked_value
1251
1252
1253 def create_outputs(parent, model, **kwargs):
1254 outputs_node = add_child_node(parent, "outputs")
1255 parameter_hardcoder = kwargs["parameter_hardcoder"]
1256
1257 for param in extract_parameters(model):
1258
1259 # no need to show hardcoded parameters
1260 hardcoded_value = parameter_hardcoder.get_hardcoded_value(param.name, model.name)
1261 if param.name in kwargs["blacklisted_parameters"] or hardcoded_value:
1262 # let's not use an extra level of indentation and use NOP
1263 continue
1264 if param.type is _OutFile:
1265 create_output_node(outputs_node, param, model, kwargs["supported_file_formats"])
1266
1267 # If there are no outputs defined in the ctd the node will have no children
1268 # and the stdout will be used as output
1269 if len(outputs_node) == 0:
1270 add_child_node(outputs_node, "data",
1271 OrderedDict([("name", "param_stdout"), ("format", "text"), ("label", "Output from stdout")]))
1272
1273
1274 def create_output_node(parent, param, model, supported_file_formats):
1275 data_node = add_child_node(parent, "data")
1276 data_node.attrib["name"] = get_galaxy_parameter_name(param)
1277
1278 data_format = "data"
1279 if param.restrictions is not None:
1280 if type(param.restrictions) is _FileFormat:
1281 # set the first data output node to the first file format
1282
1283 # check if there are formats that have not been registered yet...
1284 output = list()
1285 for format_name in param.restrictions.formats:
1286 if format_name not in supported_file_formats:
1287 output.append(str(format_name))
1288
1289 # warn only if there is at least one unsupported format
1290 if output:
1291 warning("Parameter " + param.name + " has the following unsupported format(s):" + ','.join(output), 1)
1292 data_format = ','.join(output)
1293
1294 formats = get_supported_file_types(param.restrictions.formats, supported_file_formats)
1295 try:
1296 data_format = formats.pop()
1297 except KeyError:
1298 # there is not much we can do, other than catching the exception
1299 pass
1300 # if there is more than one output file format, try to take the format from the input parameter
1301 if formats:
1302 corresponding_input = get_input_with_same_restrictions(param, model, supported_file_formats)
1303 if corresponding_input is not None:
1304 data_format = "input"
1305 data_node.attrib["metadata_source"] = get_galaxy_parameter_name(corresponding_input)
1306 else:
1307 raise InvalidModelException("Unrecognized restriction type [%(type)s] "
1308 "for output [%(name)s]" % {"type": type(param.restrictions),
1309 "name": param.name})
1310 data_node.attrib["format"] = data_format
1311
1312 #TODO: find a smarter label ?
1313 #if param.description is not None:
1314 # data_node.setAttribute("label", param.description)
1315 return data_node
1316
1317
1318 # Get the supported file format for one given format
1319 def get_supported_file_type(format_name, supported_file_formats):
1320 if format_name in supported_file_formats.keys():
1321 return supported_file_formats.get(format_name, DataType(format_name, format_name)).galaxy_extension
1322 else:
1323 return None
1324
1325
1326 def get_supported_file_types(formats, supported_file_formats):
1327 return set([supported_file_formats.get(format_name, DataType(format_name, format_name)).galaxy_extension
1328 for format_name in formats if format_name in supported_file_formats.keys()])
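The two helpers above translate CTD format names into Galaxy extensions and silently drop anything that has not been registered. A minimal sketch of that behaviour follows, assuming a plain dictionary from CTD format name to Galaxy extension in place of the `DataType` objects; `supported_galaxy_extensions` and `registered` are hypothetical names used only for illustration.

```python
def supported_galaxy_extensions(formats, registered):
    # Keep only the registered formats; the set collapses CTD formats
    # that map to the same Galaxy extension.
    return set(registered[name] for name in formats if name in registered)


# Hypothetical registry: CTD format name -> Galaxy extension.
registered = {"csv": "tabular", "ini": "txt", "txt": "txt"}
extensions = supported_galaxy_extensions(["ini", "txt", "mzML"], registered)
```

Because the result is a set, `create_output_node` cannot rely on any particular order when it calls `pop()` on it.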
1329
1330
1331 def create_change_format_node(parent, data_formats, input_ref):
1332 # <change_format>
1333 # <when input="secondary_structure" value="true" format="text"/>
1334 # </change_format>
1335 change_format_node = add_child_node(parent, "change_format")
1336 for data_format in data_formats:
1337 add_child_node(change_format_node, "when",
1338 OrderedDict([("input", input_ref), ("value", data_format), ("format", data_format)]))
1339
1340
1341 # Creates the <help> node from the 'manual' and 'docurl' attributes of the CTD.
1342 def create_help(tool, model):
1343 manual = ''
1344 doc_url = None
1345 if 'manual' in model.opt_attribs.keys():
1346 manual += '%s\n\n' % model.opt_attribs["manual"]
1347 if 'docurl' in model.opt_attribs.keys():
1348 doc_url = model.opt_attribs["docurl"]
1349
1350 help_text = "No help available"
1351 if manual:
1352 help_text = manual
1353 if doc_url is not None:
1354 help_text = manual + "\nFor more information, visit %s" % doc_url
1355 help_node = add_child_node(tool, "help")
1356 # TODO: do we need CDATA Section here?
1357 help_node.text = help_text
1358
1359
1360 # since a model might contain several ParameterGroup elements,
1361 # we want to simply 'flatten' the parameters to generate the Galaxy wrapper
1362 def extract_parameters(model):
1363 parameters = []
1364 if len(model.parameters.parameters) > 0:
1365 # use this to put parameters that are to be processed
1366 # we know that CTDModel has one parent ParameterGroup
1367 pending = [model.parameters]
1368 while len(pending) > 0:
1369 # take one element from 'pending'
1370 parameter = pending.pop()
1371 if type(parameter) is not ParameterGroup:
1372 parameters.append(parameter)
1373 else:
1374 # append the first-level children of this ParameterGroup
1375 pending.extend(parameter.parameters.values())
1376 # return the reversed list of parameters (as it is now,
1377 # we have the last parameter in the CTD as first in the list)
1378 return reversed(parameters)
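The stack-based flattening used by `extract_parameters` can be sketched in isolation. The example below is only an illustration: `Group` is a hypothetical stand-in for CTDopts' `ParameterGroup`, and plain strings stand in for parameters.

```python
class Group(object):
    """Hypothetical stand-in for a parameter group with ordered children."""

    def __init__(self, name, children):
        self.name = name
        self.children = children


def flatten(root):
    flat = []
    pending = [root]
    while pending:
        # pop() takes the most recently added element, so children are
        # visited last-to-first
        node = pending.pop()
        if isinstance(node, Group):
            pending.extend(node.children)
        else:
            flat.append(node)
    # undo the last-to-first visiting order
    flat.reverse()
    return flat


tree = Group("root", ["a", Group("g", ["b", "c"]), "d"])
flattened = flatten(tree)
```

Popping from the end of the list visits the parameters in reverse document order, which is why a final reversal is needed to restore the order in which they appear in the CTD.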
1379
1380
1381 # adds and returns a child node using the given name to the given parent node
1382 def add_child_node(parent_node, child_node_name, attributes=OrderedDict([])):
1383 child_node = SubElement(parent_node, child_node_name, attributes)
1384 return child_node
1385
1386
1387 if __name__ == "__main__":
1388 sys.exit(main())
0 <?xml version='1.0' encoding='UTF-8'?>
1 <!-- CTD2Galaxy depends on this file and on the stdio, advanced_options macros!
2 You can edit this file to add your own macros, if you so desire, or you can
3 add additional macro files using the m/macros parameter -->
4 <macros>
5 <xml name="requirements">
6 <requirements>
7 <requirement type="binary">@EXECUTABLE@</requirement>
8 </requirements>
9 </xml>
10 <xml name="stdio">
11 <stdio>
12 <exit_code range="1:"/>
13 <exit_code range=":-1"/>
14 <regex match="Error:"/>
15 <regex match="Exception:"/>
16 </stdio>
17 </xml>
18 <xml name="advanced_options">
19 <conditional name="adv_opts">
20 <param name="adv_opts_selector" type="select" label="Advanced Options">
21 <option value="basic" selected="True">Hide Advanced Options</option>
22 <option value="advanced">Show Advanced Options</option>
23 </param>
24 <when value="basic"/>
25 <when value="advanced">
26 <yield/>
27 </when>
28 </conditional>
29 </xml>
30 </macros>
+0
-1389
generator.py
0 #!/usr/bin/env python
1 # encoding: utf-8
2
3 """
4 @author: delagarza
5 """
6
7
8 import sys
9 import os
10 import traceback
11 import ntpath
12 import string
13
14 from argparse import ArgumentParser
15 from argparse import RawDescriptionHelpFormatter
16 from collections import OrderedDict
17 from string import strip
18 from lxml import etree
19 from lxml.etree import SubElement, Element, ElementTree, ParseError, parse
20
21 from CTDopts.CTDopts import CTDModel, _InFile, _OutFile, ParameterGroup, _Choices, _NumericRange, _FileFormat, \
22 ModelError, _Null
23
24 __all__ = []
25 __version__ = 1.0
26 __date__ = '2014-09-17'
27 __updated__ = '2016-05-09'
28
29 MESSAGE_INDENTATION_INCREMENT = 2
30
31 TYPE_TO_GALAXY_TYPE = {int: 'integer', float: 'float', str: 'text', bool: 'boolean', _InFile: 'data',
32 _OutFile: 'data', _Choices: 'select'}
33
34 STDIO_MACRO_NAME = "stdio"
35 REQUIREMENTS_MACRO_NAME = "requirements"
36 ADVANCED_OPTIONS_MACRO_NAME = "advanced_options"
37
38 REQUIRED_MACROS = [STDIO_MACRO_NAME, REQUIREMENTS_MACRO_NAME, ADVANCED_OPTIONS_MACRO_NAME]
39
40
41 class CLIError(Exception):
42 # Generic exception to raise and log different fatal errors.
43 def __init__(self, msg):
44 super(CLIError, self).__init__(msg)
45 self.msg = "E: %s" % msg
46
47 def __str__(self):
48 return self.msg
49
50 def __unicode__(self):
51 return self.msg
52
53
54 class InvalidModelException(ModelError):
55 def __init__(self, message):
56 super(InvalidModelException, self).__init__()
57 self.message = message
58
59 def __str__(self):
60 return self.message
61
62 def __repr__(self):
63 return self.message
64
65
66 class ApplicationException(Exception):
67 def __init__(self, msg):
68 super(ApplicationException, self).__init__(msg)
69 self.msg = msg
70
71 def __str__(self):
72 return self.msg
73
74 def __unicode__(self):
75 return self.msg
76
77
78 class ExitCode:
79 def __init__(self, code_range="", level="", description=None):
80 self.range = code_range
81 self.level = level
82 self.description = description
83
84
85 class DataType:
86 def __init__(self, extension, galaxy_extension=None, galaxy_type=None, mimetype=None):
87 self.extension = extension
88 self.galaxy_extension = galaxy_extension
89 self.galaxy_type = galaxy_type
90 self.mimetype = mimetype
91
92
93 class ParameterHardcoder:
94 def __init__(self):
95 # map whose keys are the composite names of parameters and tools in the following pattern:
96 # [ParameterName][separator][ToolName] -> HardcodedValue
97 # if the parameter applies to all tools, then the following pattern is used:
98 # [ParameterName] -> HardcodedValue
99
100 # examples (the separator is '!'):
101 # threads -> 24
102 # adapter!XTandemAdapter -> xtandem.exe
103 # adapter -> adapter.exe
104 self.separator = "!"
105 self.parameter_map = {}
106
107 # the most specific value will be returned in case of overlap
108 def get_hardcoded_value(self, parameter_name, tool_name):
109 # look for the value that would apply for all tools
110 generic_value = self.parameter_map.get(parameter_name, None)
111 specific_value = self.parameter_map.get(self.build_key(parameter_name, tool_name), None)
112 if specific_value is not None:
113 return specific_value
114
115 return generic_value
116
117 def register_parameter(self, parameter_name, parameter_value, tool_name=None):
118 self.parameter_map[self.build_key(parameter_name, tool_name)] = parameter_value
119
120 def build_key(self, parameter_name, tool_name):
121 if tool_name is None:
122 return parameter_name
123 return "%s%s%s" % (parameter_name, self.separator, tool_name)
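The lookup rule in `get_hardcoded_value` (a tool-specific entry wins over a generic one) can be sketched with a plain dictionary. This is a simplified illustration and not the class itself; `lookup` and `parameter_map` are hypothetical names.

```python
SEPARATOR = "!"


def build_key(parameter_name, tool_name=None):
    # generic entries are keyed by the parameter name alone
    if tool_name is None:
        return parameter_name
    return "%s%s%s" % (parameter_name, SEPARATOR, tool_name)


def lookup(parameter_map, parameter_name, tool_name):
    # the tool-specific value, if present, overrides the generic one
    specific = parameter_map.get(build_key(parameter_name, tool_name))
    if specific is not None:
        return specific
    return parameter_map.get(parameter_name)


parameter_map = {
    build_key("threads"): "24",
    build_key("adapter"): "adapter.exe",
    build_key("adapter", "XTandemAdapter"): "xtandem.exe",
}
```

With this map, `adapter` resolves to `xtandem.exe` for `XTandemAdapter` and to `adapter.exe` for every other tool.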
124
125
126 def main(argv=None): # IGNORE:C0111
127 # Command line options.
128 if argv is None:
129 argv = sys.argv
130 else:
131 sys.argv.extend(argv)
132
133 program_version = "v%s" % __version__
134 program_build_date = str(__updated__)
135 program_version_message = '%%(prog)s %s (%s)' % (program_version, program_build_date)
136 program_short_description = "CTD2Galaxy - A project from the GenericWorkflowNodes family " \
137 "(https://github.com/orgs/genericworkflownodes)"
138 program_usage = '''
139 USAGE:
140
141 I - Parsing a single CTD file and generating a Galaxy wrapper:
142
143 $ python generator.py -i input.ctd -o output.xml
144
145
146 II - Parsing all CTD files found in a given folder (files with .ctd and .xml extension) and
147 writing the converted Galaxy wrappers to a given folder:
148
149 $ python generator.py -i /home/user/*.ctd -o /home/user/galaxywrappers
150
151
152 III - Providing file formats, mimetypes
153
154 Galaxy supports the concept of file format in order to connect compatible ports, that is, input ports of a certain
155 data format will be able to receive data from a port of the same format. This converter allows you to provide
156 a personalized file in which you can relate the CTD data formats with supported Galaxy data formats. The layout of
157 this file consists of lines, each of one, three or four columns separated by any amount of whitespace. The content
158 of each column is as follows:
159
160 * 1st column: file extension
161 * 2nd column: data type, as listed in Galaxy
162 * 3rd column: full-named Galaxy data type, as it will appear on datatypes_conf.xml
163 * 4th column: mimetype (optional)
164
165 The following is an example of a valid "file formats" file:
166
167 ########################################## FILE FORMATS example ##########################################
168 # Every line starting with a # will be handled as a comment and will not be parsed.
169 # The first column is the file format as given in the CTD and second column is the Galaxy data format.
170 # The second, third and fourth columns can be left empty if the data type has already been registered
171 # in Galaxy, otherwise, all but the mimetype must be provided.
172
173 # CTD type # Galaxy type # Long Galaxy data type # Mimetype
174 csv tabular galaxy.datatypes.data:Text
175 fasta
176 ini txt galaxy.datatypes.data:Text
177 txt
178 idxml txt galaxy.datatypes.xml:GenericXml application/xml
179 options txt galaxy.datatypes.data:Text
180 grid grid galaxy.datatypes.data:Grid
181
182 ##########################################################################################################
183
184 Note that each line consists precisely of either one, three or four columns. In the case of data types already
185 registered in Galaxy (such as fasta and txt in the above example), only the first column is needed. In the case of
186 data types that have not yet been registered in Galaxy, the first three columns are needed (mimetype is optional).
187
188 For information about Galaxy data types and subclasses, see the following page:
189 https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes
190
191
192 IV - Hardcoding parameters
193
194 It is possible to hardcode parameters. This makes sense if you want to set a tool in Galaxy in 'quiet' mode or if
195 your tools support multi-threading and accept the number of threads via a parameter, without giving the end user the
196 chance to change the values for these parameters.
197
198 In order to generate hardcoded parameters, you need to provide a simple file. Each line of this file contains two
199 or three columns separated by whitespace. Any line starting with a '#' will be ignored. The first column contains
200 the name of the parameter, the second column contains the value that will always be set for this parameter. The
201 first two columns are mandatory.
202
203 If the parameter is to be hardcoded only for a set of tools, then a third column can be added. This column includes
204 a comma-separated list of tool names for which the parameter will be hardcoded. If a third column is not included,
205 then all processed tools containing the given parameter will get a hardcoded value for it.
206
207 The following is an example of a valid file:
208
209 ##################################### HARDCODED PARAMETERS example #####################################
210 # Every line starting with a # will be handled as a comment and will not be parsed.
211 # The first column is the name of the parameter and the second column is the value that will be used.
212
213 # Parameter name # Value # Tool(s)
214 threads \${GALAXY_SLOTS:-24}
215 mode quiet
216 xtandem_executable xtandem XTandemAdapter
217 verbosity high Foo, Bar
218
219 #########################################################################################################
220
221 Using the above file will produce a <command> similar to:
222
223 [tool_name] ... -threads \${GALAXY_SLOTS:-24} -mode quiet ...
224
225 For all tools. For XTandemAdapter, the <command> will be similar to:
226
227 XTandemAdapter ... -threads \${GALAXY_SLOTS:-24} -mode quiet -xtandem_executable xtandem ...
228
229 And for tools Foo and Bar, the <command> will be similar to:
230
231 Foo ... -threads \${GALAXY_SLOTS:-24} -mode quiet -verbosity high ...
232
233
234 V - Control which tools will be converted
235
236 Sometimes only a subset of CTDs needs to be converted. It is possible to either explicitly specify which tools will
237 be converted or which tools will not be converted.
238
239 The value of the -s/--skip-tools parameter is a file in which each line will be interpreted as the name of a tool
240 that will not be converted. Conversely, the value of the -r/--required-tools is a file in which each line will be
241 interpreted as a tool that is required. Only one of these parameters can be specified at a given time.
242
243 The format of both files is exactly the same. As stated before, each line will be interpreted as the name of a tool;
244 any line starting with a '#' will be ignored.
245
246 '''
247 program_license = '''%(short_description)s
248 Copyright 2015, Luis de la Garza
249
250 Licensed under the Apache License, Version 2.0 (the "License");
251 you may not use this file except in compliance with the License.
252 You may obtain a copy of the License at
253
254 http://www.apache.org/licenses/LICENSE-2.0
255
256 Unless required by applicable law or agreed to in writing, software
257 distributed under the License is distributed on an "AS IS" BASIS,
258 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
259 See the License for the specific language governing permissions and
260 limitations under the License.
261
262 %(usage)s
263 ''' % {'short_description': program_short_description, 'usage': program_usage}
264
265 try:
266 # Setup argument parser
267 parser = ArgumentParser(prog="CTD2Galaxy", description=program_license,
268 formatter_class=RawDescriptionHelpFormatter, add_help=True)
269 parser.add_argument("-i", "--input", dest="input_files", default=[], required=True, nargs="+", action="append",
270 help="List of CTD files to convert.")
271 parser.add_argument("-o", "--output-destination", dest="output_destination", required=True,
272 help="If multiple input files are given, then a folder in which all "
273 "XMLs will be generated is expected; "
274 "if a single input file is given, then a destination file is expected.")
275 parser.add_argument("-f", "--formats-file", dest="formats_file",
276 help="File containing the supported file formats. Run with '-h' or '--help' to see a "
277 "brief example on the layout of this file.", default=None, required=False)
278 parser.add_argument("-a", "--add-to-command-line", dest="add_to_command_line",
279 help="Adds content to the command line", default="", required=False)
280 parser.add_argument("-d", "--datatypes-destination", dest="data_types_destination",
281 help="Specify the location of a datatypes_conf.xml to modify and add the registered "
282 "data types. If the provided destination does not exist, a new file will be created.",
283 default=None, required=False)
284 parser.add_argument("-x", "--default-executable-path", dest="default_executable_path",
285 help="Use this executable path when <executablePath> is not present in the CTD",
286 default=None, required=False)
287 parser.add_argument("-b", "--blacklist-parameters", dest="blacklisted_parameters", default=[], nargs="+", action="append",
288 help="List of parameters that will be ignored and won't appear on the Galaxy stub",
289 required=False)
290 parser.add_argument("-c", "--default-category", dest="default_category", default="DEFAULT", required=False,
291 help="Default category to use for tools lacking a category when generating tool_conf.xml")
292 parser.add_argument("-t", "--tool-conf-destination", dest="tool_conf_destination", default=None, required=False,
293 help="Specify the location of an existing tool_conf.xml that will be modified to include "
294 "the converted tools. If the provided destination does not exist, a new file will "
295 "be created.")
296 parser.add_argument("-g", "--galaxy-tool-path", dest="galaxy_tool_path", default=None, required=False,
297 help="The path that will be prepended to the file names when generating tool_conf.xml")
298 parser.add_argument("-r", "--required-tools", dest="required_tools_file", default=None, required=False,
299 help="Each line of the file will be interpreted as a tool name that needs translation. "
300 "Run with '-h' or '--help' to see a brief example on the format of this file.")
301 parser.add_argument("-s", "--skip-tools", dest="skip_tools_file", default=None, required=False,
302 help="File containing a list of tools for which a Galaxy stub will not be generated. "
303 "Run with '-h' or '--help' to see a brief example on the format of this file.")
304 parser.add_argument("-m", "--macros", dest="macros_files", default=[], nargs="*",
305 action="append", required=False, help="Import the additional given file(s) as macros. "
306 "The macros stdio, requirements and advanced_options are required. Please see "
307 "macros.xml for an example of a valid macros file. All defined macros will be imported.")
308 parser.add_argument("-p", "--hardcoded-parameters", dest="hardcoded_parameters", default=None, required=False,
309 help="File containing hardcoded values for the given parameters. Run with '-h' or '--help' "
310 "to see a brief example on the format of this file.")
311 parser.add_argument("-v", "--validation-schema", dest="xsd_location", default=None, required=False,
312 help="Location of the schema to use to validate CTDs.")
313
314 # TODO: add verbosity, maybe?
315 parser.add_argument("-V", "--version", action='version', version=program_version_message)
316
317 # Process arguments
318 args = parser.parse_args()
319
320 # validate and prepare the passed arguments
321 validate_and_prepare_args(args)
322
323 # extract the names of the macros and check that we have found the ones we need
324 macros_to_expand = parse_macros_files(args.macros_files)
325
326 # parse the given supported file-formats file
327 supported_file_formats = parse_file_formats(args.formats_file)
328
329 # parse the hardcoded parameters file
330 parameter_hardcoder = parse_hardcoded_parameters(args.hardcoded_parameters)
331
332 # parse the skip/required tools files
333 skip_tools = parse_tools_list_file(args.skip_tools_file)
334 required_tools = parse_tools_list_file(args.required_tools_file)
335
336 #if verbose > 0:
337 # print("Verbose mode on")
338 parsed_models = convert(args.input_files,
339 args.output_destination,
340 supported_file_formats=supported_file_formats,
341 default_executable_path=args.default_executable_path,
342 add_to_command_line=args.add_to_command_line,
343 blacklisted_parameters=args.blacklisted_parameters,
344 required_tools=required_tools,
345 skip_tools=skip_tools,
346 macros_file_names=args.macros_files,
347 macros_to_expand=macros_to_expand,
348 parameter_hardcoder=parameter_hardcoder,
349 xsd_location=args.xsd_location)
350
351 #TODO: add some sort of warning if a macro that doesn't exist is to be expanded
352
353 # it is not needed to copy the macros files, since the user has provided them
354
355 # generation of galaxy stubs is ready... now, let's see if we need to generate a tool_conf.xml
356 if args.tool_conf_destination is not None:
357 generate_tool_conf(parsed_models, args.tool_conf_destination,
358 args.galaxy_tool_path, args.default_category)
359
360 # now datatypes_conf.xml
361 if args.data_types_destination is not None:
362 generate_data_type_conf(supported_file_formats, args.data_types_destination)
363
364 return 0
365
366 except KeyboardInterrupt:
367 # handle keyboard interrupt
368 return 0
369 except ApplicationException, e:
370 error("CTD2Galaxy could not complete the requested operation.", 0)
371 error("Reason: " + e.msg, 0)
372 return 1
373 except ModelError, e:
374 error("There seems to be a problem with one of your input CTDs.", 0)
375 error("Reason: " + e.msg, 0)
376 return 1
377 except Exception, e:
378 traceback.print_exc()
379 return 2
380
381
382 def parse_tools_list_file(tools_list_file):
383 tools_list = None
384 if tools_list_file is not None:
385 tools_list = []
386 with open(tools_list_file) as f:
387 for line in f:
388 if line is None or not line.strip() or line.strip().startswith("#"):
389 continue
390 else:
391 tools_list.append(line.strip())
392
393 return tools_list
394
395
396 def parse_macros_files(macros_file_names):
397 macros_to_expand = set()
398
399 for macros_file_name in macros_file_names:
400 try:
401 macros_file = open(macros_file_name)
402 info("Loading macros from %s" % macros_file_name, 0)
403 root = parse(macros_file).getroot()
404 for xml_element in root.findall("xml"):
405 name = xml_element.attrib["name"]
406 if name in macros_to_expand:
407 warning("Macro %s has already been found. Duplicate found in file %s." %
408 (name, macros_file_name), 0)
409 else:
410 info("Macro %s found" % name, 1)
411 macros_to_expand.add(name)
412 except ParseError, e:
413 raise ApplicationException("The macros file " + macros_file_name + " could not be parsed. Cause: " +
414 str(e))
415 except IOError, e:
416 raise ApplicationException("The macros file " + macros_file_name + " could not be opened. Cause: " +
417 str(e))
418
419 # we depend on "stdio", "requirements" and "advanced_options" to exist in at least one of the given macros files
420 missing_needed_macros = []
421 for required_macro in REQUIRED_MACROS:
422 if required_macro not in macros_to_expand:
423 missing_needed_macros.append(required_macro)
424
425 if missing_needed_macros:
426 raise ApplicationException(
427 "The following required macro(s) were not found in any of the given macros files: %s, "
428 "see sample_files/macros.xml for an example of a valid macros file."
429 % ", ".join(missing_needed_macros))
430
431 # we do not need to "expand" the advanced_options macro
432 macros_to_expand.remove(ADVANCED_OPTIONS_MACRO_NAME)
433 return macros_to_expand
434
435
436 def parse_hardcoded_parameters(hardcoded_parameters_file):
437 parameter_hardcoder = ParameterHardcoder()
438 if hardcoded_parameters_file is not None:
439 line_number = 0
440 with open(hardcoded_parameters_file) as f:
441 for line in f:
442 line_number += 1
443 if line is None or not line.strip() or line.strip().startswith("#"):
444 pass
445 else:
446 # the third column must be obtained as a whole, and not split
447 parsed_hardcoded_parameter = line.strip().split(None, 2)
448 # valid lines contain two or three columns
449 if len(parsed_hardcoded_parameter) != 2 and len(parsed_hardcoded_parameter) != 3:
450 warning("Invalid line at line number %d of the given hardcoded parameters file. Line will be "
451 "ignored:\n%s" % (line_number, line), 0)
452 continue
453
454 parameter_name = parsed_hardcoded_parameter[0]
455 hardcoded_value = parsed_hardcoded_parameter[1]
456 tool_names = None
457 if len(parsed_hardcoded_parameter) == 3:
458 tool_names = parsed_hardcoded_parameter[2].split(',')
459 if tool_names:
460 for tool_name in tool_names:
461 parameter_hardcoder.register_parameter(parameter_name, hardcoded_value, tool_name.strip())
462 else:
463 parameter_hardcoder.register_parameter(parameter_name, hardcoded_value)
464
465 return parameter_hardcoder
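The key detail in `parse_hardcoded_parameters` is `split(None, 2)`: it splits on whitespace at most twice, so a tool list such as `Foo, Bar` stays in one piece even though it contains a space. The following is a minimal sketch of parsing a single line under that rule; `parse_hardcoded_line` is a hypothetical helper, and unlike the function above it returns `None` for invalid lines instead of emitting a warning.

```python
def parse_hardcoded_line(line):
    stripped = line.strip()
    # blank lines and comments carry no parameter
    if not stripped or stripped.startswith("#"):
        return None
    # at most two splits: name, value, and the rest (the tool list)
    columns = stripped.split(None, 2)
    if len(columns) < 2:
        return None
    tool_names = None
    if len(columns) == 3:
        tool_names = [name.strip() for name in columns[2].split(",")]
    return columns[0], columns[1], tool_names


parsed = parse_hardcoded_line("verbosity high Foo, Bar\n")
```

A plain `split()` would instead yield four columns for that line and the line would be rejected.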
466
467
468 def parse_file_formats(formats_file):
469 supported_formats = {}
470 if formats_file is not None:
471 line_number = 0
472 with open(formats_file) as f:
473 for line in f:
474 line_number += 1
475 if line is None or not line.strip() or line.strip().startswith("#"):
476 # ignore (it'd be weird to have something like:
477 # if line is not None and not (not line.strip()) ...
478 pass
479 else:
480 # not an empty line, no comment
481 # strip the line and split by whitespace
482 parsed_formats = line.strip().split()
483 # valid lines contain either one, three or four columns
484 if not (len(parsed_formats) == 1 or len(parsed_formats) == 3 or len(parsed_formats) == 4):
485 warning("Invalid line at line number %d of the given formats file. Line will be ignored:\n%s" %
486 (line_number, line), 0)
487 # ignore the line
488 continue
489 elif len(parsed_formats) == 1:
490 supported_formats[parsed_formats[0]] = DataType(parsed_formats[0], parsed_formats[0])
491 else:
492 mimetype = None
493 # check if mimetype was provided
494 if len(parsed_formats) == 4:
495 mimetype = parsed_formats[3]
496 supported_formats[parsed_formats[0]] = DataType(parsed_formats[0], parsed_formats[1],
497 parsed_formats[2], mimetype)
498 return supported_formats
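The column rules applied by `parse_file_formats` (one column for formats Galaxy already knows, three or four for formats that still need to be registered) can be sketched for a single line. This is an illustration only: the `namedtuple` stands in for the `DataType` class, and `parse_format_line` is a hypothetical helper that returns `None` where the original logs a warning.

```python
from collections import namedtuple

# Stand-in for the DataType class defined in this module.
DataType = namedtuple("DataType", ["extension", "galaxy_extension", "galaxy_type", "mimetype"])


def parse_format_line(line):
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    columns = stripped.split()
    if len(columns) == 1:
        # already registered in Galaxy: the CTD name doubles as the Galaxy extension
        return DataType(columns[0], columns[0], None, None)
    if len(columns) in (3, 4):
        mimetype = columns[3] if len(columns) == 4 else None
        return DataType(columns[0], columns[1], columns[2], mimetype)
    # two columns, or more than four, is not a valid layout
    return None


idxml = parse_format_line("idxml txt galaxy.datatypes.xml:GenericXml application/xml")
```

Only entries with a `galaxy_type` end up in the generated `datatypes_conf.xml`, which is why single-column lines leave that field empty.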
499
500
501 def validate_and_prepare_args(args):
502 # check that only one of skip_tools_file and required_tools_file has been provided
503 if args.skip_tools_file is not None and args.required_tools_file is not None:
504 raise ApplicationException(
505 "You have provided both a file with tools to ignore and a file with required tools.\n"
506 "Only one of -s/--skip-tools, -r/--required-tools can be provided.")
507
508 # first, we convert all list of lists in args to flat lists
509 lists_to_flatten = ["input_files", "blacklisted_parameters", "macros_files"]
510 for list_to_flatten in lists_to_flatten:
511 setattr(args, list_to_flatten, [item for sub_list in getattr(args, list_to_flatten) for item in sub_list])
512
513 # if input is a single file, we expect output to be a file (and not a dir that already exists)
514 if len(args.input_files) == 1:
515 if os.path.isdir(args.output_destination):
516 raise ApplicationException("If a single input file is provided, output (%s) is expected to be a file "
517 "and not a folder.\n" % args.output_destination)
518
519 # if input is a list of files, we expect output to be a folder
520 if len(args.input_files) > 1:
521 if not os.path.isdir(args.output_destination):
522 raise ApplicationException("If several input files are provided, output (%s) is expected to be an "
523 "existing directory.\n" % args.output_destination)
524
525 # check that the provided input files, if provided, contain a valid file path
526 input_variables_to_check = ["skip_tools_file", "required_tools_file", "macros_files", "xsd_location",
527 "input_files", "formats_file", "hardcoded_parameters"]
528
529 for variable_name in input_variables_to_check:
530 paths_to_check = []
531 # check if we are handling a single file or a list of files
532 member_value = getattr(args, variable_name)
533 if member_value is not None:
534 if isinstance(member_value, list):
535 for file_name in member_value:
536 paths_to_check.append(strip(str(file_name)))
537 else:
538 paths_to_check.append(strip(str(member_value)))
539
540 for path_to_check in paths_to_check:
541 if not os.path.isfile(path_to_check) or not os.path.exists(path_to_check):
542 raise ApplicationException(
543 "The provided input file (%s) does not exist or is not a valid file path."
544 % path_to_check)
545
546 # check that the provided output files, if provided, contain a valid file path (i.e., not a folder)
547 output_variables_to_check = ["data_types_destination", "tool_conf_destination"]
548
549 for variable_name in output_variables_to_check:
550 file_name = getattr(args, variable_name)
551 if file_name is not None and os.path.isdir(file_name):
552 raise ApplicationException("The provided output file name (%s) points to a directory." % file_name)
553
554 if not args.macros_files:
555 # list is empty, provide the default value
556 warning("Using default macros from macros.xml", 0)
557 args.macros_files = ["macros.xml"]
558
559
560 def convert(input_files, output_destination, **kwargs):
561 # first, generate a model
562 is_converting_multiple_ctds = len(input_files) > 1
563 parsed_models = []
564 schema = None
565 if kwargs["xsd_location"] is not None:
566 try:
567 info("Loading validation schema from %s" % kwargs["xsd_location"], 0)
568 schema = etree.XMLSchema(etree.parse(kwargs["xsd_location"]))
569 except Exception, e:
570 error("Could not load validation schema %s. Reason: %s" % (kwargs["xsd_location"], str(e)), 0)
571 else:
572 info("Validation against a schema has not been enabled.", 0)
573 for input_file in input_files:
574 try:
575 if schema is not None:
576 validate_against_schema(input_file, schema)
577 model = CTDModel(from_file=input_file)
578 except Exception, e:
579 error(str(e), 1)
580 continue
581
582 if kwargs["skip_tools"] is not None and model.name in kwargs["skip_tools"]:
583 info("Skipping tool %s" % model.name, 0)
584 continue
585 elif kwargs["required_tools"] is not None and model.name not in kwargs["required_tools"]:
586 info("Tool %s is not required, skipping it" % model.name, 0)
587 continue
588 else:
589 info("Converting from %s " % input_file, 0)
590 tool = create_tool(model)
591 write_header(tool, model)
592 create_description(tool, model)
593 expand_macros(tool, model, **kwargs)
594 create_command(tool, model, **kwargs)
595 create_inputs(tool, model, **kwargs)
596 create_outputs(tool, model, **kwargs)
597 create_help(tool, model)
598
599 # finally, serialize the tool
600 output_file = output_destination
601 # if multiple inputs are being converted,
602 # then we need to generate a different output_file for each input
603 if is_converting_multiple_ctds:
604 output_file = os.path.join(output_file, get_filename_without_suffix(input_file) + ".xml")
605 # wrap our tool element into a tree to be able to serialize it
606 tree = ElementTree(tool)
607 tree.write(open(output_file, 'w'), encoding="UTF-8", xml_declaration=True, pretty_print=True)
608 # let's use model to hold the name of the output file
609 parsed_models.append([model, get_filename(output_file)])
610
611 return parsed_models
612
613
614 # validates a ctd file against the schema
615 def validate_against_schema(ctd_file, schema):
616 try:
617 parser = etree.XMLParser(schema=schema)
618 etree.parse(ctd_file, parser=parser)
619 except etree.XMLSyntaxError, e:
620 raise ApplicationException("Input ctd file %s is not valid. Reason: %s" % (ctd_file, str(e)))
621
622
623 def write_header(tool, model):
624 tool.addprevious(etree.Comment(
625 "This is a configuration file for the integration of a tool into Galaxy (https://galaxyproject.org/). "
626 "This file was automatically generated using CTD2Galaxy."))
627 tool.addprevious(etree.Comment('Proposed Tool Section: [%s]' % model.opt_attribs.get("category", "")))
628
629
630 def generate_tool_conf(parsed_models, tool_conf_destination, galaxy_tool_path, default_category):
631 # for each category, we keep a list of models corresponding to it
632 categories_to_tools = dict()
633 for model in parsed_models:
634 category = strip(model[0].opt_attribs.get("category", ""))
635 if not category.strip():
636 category = default_category
637 if category not in categories_to_tools:
638 categories_to_tools[category] = []
639 categories_to_tools[category].append(model[1])
640
641 # at this point, we should have a map for all categories->tools
642 toolbox_node = Element("toolbox")
643
644 if galaxy_tool_path is not None and not galaxy_tool_path.strip().endswith("/"):
645 galaxy_tool_path = galaxy_tool_path.strip() + "/"
646 if galaxy_tool_path is None:
647 galaxy_tool_path = ""
648
649 for category, file_names in categories_to_tools.items():
650 section_node = add_child_node(toolbox_node, "section")
651 section_node.attrib["id"] = "section-id-" + "".join(category.split())
652 section_node.attrib["name"] = category
653
654 for filename in file_names:
655 tool_node = add_child_node(section_node, "tool")
656 tool_node.attrib["file"] = galaxy_tool_path + filename
657
658 toolconf_tree = ElementTree(toolbox_node)
659 toolconf_tree.write(tool_conf_destination, encoding="UTF-8", xml_declaration=True, pretty_print=True)
660 info("Generated Galaxy tool_conf.xml in %s" % tool_conf_destination, 0)
661
662
663 def generate_data_type_conf(supported_file_formats, data_types_destination):
664 data_types_node = Element("datatypes")
665 registration_node = add_child_node(data_types_node, "registration")
666 registration_node.attrib["converters_path"] = "lib/galaxy/datatypes/converters"
667 registration_node.attrib["display_path"] = "display_applications"
668
669 for format_name in supported_file_formats:
670 data_type = supported_file_formats[format_name]
671 # add only if it's a data type that does not exist in Galaxy
672 if data_type.galaxy_type is not None:
673 data_type_node = add_child_node(registration_node, "datatype")
674 # we know galaxy_extension is not None
675 data_type_node.attrib["extension"] = data_type.galaxy_extension
676 data_type_node.attrib["type"] = data_type.galaxy_type
677 if data_type.mimetype is not None:
678 data_type_node.attrib["mimetype"] = data_type.mimetype
679
680 data_types_tree = ElementTree(data_types_node)
681 data_types_tree.write(data_types_destination, encoding="UTF-8", xml_declaration=True, pretty_print=True)
682 info("Generated Galaxy datatypes_conf.xml in %s" % data_types_destination, 0)
683
684
685 # taken from
686 # http://stackoverflow.com/questions/8384737/python-extract-file-name-from-path-no-matter-what-the-os-path-format
687 def get_filename(path):
688 head, tail = ntpath.split(path)
689 return tail or ntpath.basename(head)
690
691
692 def get_filename_without_suffix(path):
693 root, ext = os.path.splitext(os.path.basename(path))
694 return root
695
696
697 def create_tool(model):
698 return Element("tool", OrderedDict([("id", model.name), ("name", model.name), ("version", model.version)]))
699
700
701 def create_description(tool, model):
702 if model.opt_attribs.get("description") is not None:
703 description = SubElement(tool, "description")
704 description.text = model.opt_attribs["description"]
705
706
707 def get_param_cli_name(param, model):
708 # we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy)
709 if type(param.parent) == ParameterGroup:
710 if not hasattr(param.parent.parent, 'parent'):
711 return resolve_param_mapping(param, model)
712 elif not hasattr(param.parent.parent.parent, 'parent'):
713 return resolve_param_mapping(param, model)
714 else:
715 if model.cli:
716 warning("Using nested parameter sections (NODE elements) is not compatible with <cli>", 1)
717 return get_param_name(param.parent) + ":" + resolve_param_mapping(param, model)
718 else:
719 return resolve_param_mapping(param, model)
720
721
722 def get_param_name(param):
723 # we generate parameters with colons for subgroups, but not for the two topmost parents (OpenMS legacy)
724 if type(param.parent) == ParameterGroup:
725 if not hasattr(param.parent.parent, 'parent'):
726 return param.name
727 elif not hasattr(param.parent.parent.parent, 'parent'):
728 return param.name
729 else:
730 return get_param_name(param.parent) + ":" + param.name
731 else:
732 return param.name
733
734
735 # some parameters are mapped to command line options, this method helps resolve those mappings, if any
736 def resolve_param_mapping(param, model):
737 # go through all mappings and find if the given param appears as a reference name in a mapping element
738 param_mapping = None
739 for cli_element in model.cli:
740 for mapping_element in cli_element.mappings:
741 if mapping_element.reference_name == param.name:
742 if param_mapping is not None:
743 warning("The parameter %s has more than one mapping in the <cli> section. "
744 "The first found mapping, %s, will be used." % (param.name, param_mapping), 1)
745 else:
746 param_mapping = cli_element.option_identifier
747
748 return param_mapping if param_mapping is not None else param.name
749
750 def create_command(tool, model, **kwargs):
751 final_command = get_tool_executable_path(model, kwargs["default_executable_path"]) + '\n'
752 final_command += kwargs["add_to_command_line"] + '\n'
753 advanced_command_start = "#if $adv_opts.adv_opts_selector=='advanced':\n"
754 advanced_command_end = '#end if'
755 advanced_command = ''
756 parameter_hardcoder = kwargs["parameter_hardcoder"]
757
758 found_output_parameter = False
759 for param in extract_parameters(model):
760 if param.type is _OutFile:
761 found_output_parameter = True
762 command = ''
763 param_name = get_param_name(param)
764 param_cli_name = get_param_cli_name(param, model)
765 if param_name == param_cli_name:
766 # there was no mapping, so for the cli name we will use a '-' in the prefix
767 param_cli_name = '-' + param_name
768
769 if param.name in kwargs["blacklisted_parameters"]:
770 continue
771
772 hardcoded_value = parameter_hardcoder.get_hardcoded_value(param_name, model.name)
773 if hardcoded_value:
774 command += '%s %s\n' % (param_cli_name, hardcoded_value)
775 else:
776 # parameter is neither blacklisted nor hardcoded...
777 galaxy_parameter_name = get_galaxy_parameter_name(param)
778 repeat_galaxy_parameter_name = get_repeat_galaxy_parameter_name(param)
779
780 # logic for ITEMLISTs
781 if param.is_list:
782 if param.type is _InFile:
783 command += param_cli_name + "\n"
784 command += " #for token in $" + galaxy_parameter_name + ":\n"
785 command += " $token\n"
786 command += " #end for\n"
787 else:
788 command += "\n#if $" + repeat_galaxy_parameter_name + ":\n"
789 command += param_cli_name + "\n"
790 command += " #for token in $" + repeat_galaxy_parameter_name + ":\n"
791 command += " #if \" \" in str(token):\n"
792 command += " \"$token." + galaxy_parameter_name + "\"\n"
793 command += " #else\n"
794 command += " $token." + galaxy_parameter_name + "\n"
795 command += " #end if\n"
796 command += " #end for\n"
797 command += "#end if\n"
798 # logic for other ITEMs
799 else:
800 if param.advanced and param.type is not _OutFile:
801 actual_parameter = "$adv_opts.%s" % galaxy_parameter_name
802 else:
803 actual_parameter = "$%s" % galaxy_parameter_name
804 ## if whitespace_validation has been set, we need to generate, for each parameter:
805 ## #if str( $t ).split() != '':
806 ## -t "$t"
807 ## #end if
808 ## TODO only useful for text fields, integers or floats
809 ## not useful for choices, input fields ...
810
811 if not is_boolean_parameter(param) and type(param.restrictions) is _Choices:
812 command += "#if " + actual_parameter + ":\n"
813 command += ' %s\n' % param_cli_name
814 command += " #if \" \" in str(" + actual_parameter + "):\n"
815 command += " \"" + actual_parameter + "\"\n"
816 command += " #else\n"
817 command += " " + actual_parameter + "\n"
818 command += " #end if\n"
819 command += "#end if\n"
820 elif is_boolean_parameter(param):
821 command += "#if " + actual_parameter + ":\n"
822 command += ' %s\n' % param_cli_name
823 command += "#end if\n"
824 elif TYPE_TO_GALAXY_TYPE[param.type] == 'text':
825 command += "#if " + actual_parameter + ":\n"
826 command += " %s " % param_cli_name
827 command += " \"" + actual_parameter + "\"\n"
828 command += "#end if\n"
829 else:
830 command += "#if " + actual_parameter + ":\n"
831 command += ' %s ' % param_cli_name
832 command += actual_parameter + "\n"
833 command += "#end if\n"
834
835 if param.advanced and param.type is not _OutFile:
836 advanced_command += " %s" % command
837 else:
838 final_command += command
839
840 if advanced_command:
841 final_command += "%s%s%s\n" % (advanced_command_start, advanced_command, advanced_command_end)
842
843 if not found_output_parameter:
844 final_command += "> $param_stdout\n"
845
846 command_node = add_child_node(tool, "command")
847 command_node.text = final_command
848
849
850 # creates the xml elements needed to import the needed macros files
851 # and to "expand" the macros
852 def expand_macros(tool, model, **kwargs):
853 macros_node = add_child_node(tool, "macros")
854 token_node = add_child_node(macros_node, "token")
855 token_node.attrib["name"] = "@EXECUTABLE@"
856 token_node.text = get_tool_executable_path(model, kwargs["default_executable_path"])
857
858 # add <import> nodes
859 for macro_file_name in kwargs["macros_file_names"]:
860 macro_file = open(macro_file_name)
861 import_node = add_child_node(macros_node, "import")
862 # do not add the path of the file, rather, just its basename
863 import_node.text = os.path.basename(macro_file.name)
864
865 # add <expand> nodes
866 for expand_macro in kwargs["macros_to_expand"]:
867 expand_node = add_child_node(tool, "expand")
868 expand_node.attrib["macro"] = expand_macro
869
870
871 def get_tool_executable_path(model, default_executable_path):
872 # rules to build the galaxy executable path:
873 # if executablePath is null, then use default_executable_path and store it in executablePath
874 # if executablePath is null and executableName is null, then the name of the tool will be used
875 # if executablePath is null and executableName is not null, then executableName will be used
876 # if executablePath is not null and executableName is null,
877 # then executablePath and the name of the tool will be used
878 # if executablePath is not null and executableName is not null, then both will be used
879
880 # first, check if the model has executablePath / executableName defined
881 executable_path = model.opt_attribs.get("executablePath", None)
882 executable_name = model.opt_attribs.get("executableName", None)
883
884 # check if we need to use the default_executable_path
885 if executable_path is None:
886 executable_path = default_executable_path
887
888 # fix the executablePath to make sure that there is a '/' in the end
889 if executable_path is not None:
890 executable_path = executable_path.strip()
891 if not executable_path.endswith('/'):
892 executable_path += '/'
893
894 # assume that we have all information present
895 command = str(executable_path) + str(executable_name)
896 if executable_path is None:
897 if executable_name is None:
898 command = model.name
899 else:
900 command = executable_name
901 else:
902 if executable_name is None:
903 command = executable_path + model.name
904 return command
905
906
907 def get_galaxy_parameter_name(param):
908 return "param_%s" % get_param_name(param).replace(':', '_').replace('-', '_')
909
910
911 def get_input_with_same_restrictions(out_param, model, supported_file_formats):
912 for param in extract_parameters(model):
913 if param.type is _InFile:
914 if param.restrictions is not None:
915 in_param_formats = get_supported_file_types(param.restrictions.formats, supported_file_formats)
916 out_param_formats = get_supported_file_types(out_param.restrictions.formats, supported_file_formats)
917 if in_param_formats == out_param_formats:
918 return param
919
920
921 def create_inputs(tool, model, **kwargs):
922 inputs_node = SubElement(tool, "inputs")
923
924 # some suites (such as OpenMS) need some advanced options when handling inputs
925 expand_advanced_node = add_child_node(tool, "expand", OrderedDict([("macro", ADVANCED_OPTIONS_MACRO_NAME)]))
926 parameter_hardcoder = kwargs["parameter_hardcoder"]
927
928 # treat all non output-file parameters as inputs
929 for param in extract_parameters(model):
930 # no need to show hardcoded parameters
931 hardcoded_value = parameter_hardcoder.get_hardcoded_value(param.name, model.name)
932 if param.name in kwargs["blacklisted_parameters"] or hardcoded_value:
933 # let's not use an extra level of indentation and use NOP
934 continue
935 if param.type is not _OutFile:
936 if param.advanced:
937 if expand_advanced_node is not None:
938 parent_node = expand_advanced_node
939 else:
940 # something went wrong... we are handling an advanced parameter and the
941 # advanced input macro was not set... inform the user about it
942 info("The parameter %s has been set as advanced, but advanced_input_macro has "
943 "not been set." % param.name, 1)
944 # there is not much we can do, other than use the inputs_node as a parent node!
945 parent_node = inputs_node
946 else:
947 parent_node = inputs_node
948
949 # for lists we need a repeat tag
950 if param.is_list and param.type is not _InFile:
951 rep_node = add_child_node(parent_node, "repeat")
952 create_repeat_attribute_list(rep_node, param)
953 parent_node = rep_node
954
955 param_node = add_child_node(parent_node, "param")
956 create_param_attribute_list(param_node, param, kwargs["supported_file_formats"])
957
958 # advanced parameter selection should be at the end
959 # and only available if an advanced parameter exists
960 if expand_advanced_node is not None and len(expand_advanced_node) > 0:
961 inputs_node.append(expand_advanced_node)
962
963
964 def get_repeat_galaxy_parameter_name(param):
965 return "rep_" + get_galaxy_parameter_name(param)
966
967
968 def create_repeat_attribute_list(rep_node, param):
969 rep_node.attrib["name"] = get_repeat_galaxy_parameter_name(param)
970 if param.required:
971 rep_node.attrib["min"] = "1"
972 else:
973 rep_node.attrib["min"] = "0"
974 # for the ITEMLISTs which have LISTITEM children we only
975 # need one parameter as it is given as a string
976 if param.default is not None:
977 rep_node.attrib["max"] = "1"
978