Advanced Data Flow
By default WarpTools directories for raw data and processed data are specified
in a .settings
text file created with the create_settings
tool.
This page is for advanced users who want to customise data flow in their WarpTools processing.
Concepts
Customising data flow using WarpTools requires understanding a few key concepts.
- raw data directory
- processing directory
Raw data directory
The raw data directory contains what WarpTools considers the raw data.
For frame series, raw data are movie files, usually .mrc
or .tif
.
For tilt series, raw data are .tomostar
files.
Processing directory
Outputs from all WarpTools programs will be written into the processing directory.
Per movie or per tilt series metadata will be written in .xml
files in the root of
the processing directory whilst images and other process specific data are written into
subdirectories of the processing directory with names like average
, matching
or
reconstruction
.
Redirecting the Flow
Now that we know the basics, we're ready to redirect the processing flow.
Here are the relevant options available in all WarpTools commands.
Options
--input_data
The --input_data
option overrides the list of input files specified in the .settings file.
It accepts a space-separated list of files, wildcard patterns, or .txt files with one file name per line.
--input_processing
--input_processing
specifies an alternative directory containing pre-processed results.
This overrides the processing directory specified in the .settings
file and affects
both file input and output.
--output_processing
--output_processing
specifies an alternative directory to save processing results.
This also overrides the processing directory in the .settings file, but only for file
output.
Examples
Let's go through some examples...
Generating Particle Stacks for Multiple Species
When WarpTools generates 2D or 3D particles from tilt series data it writes image files
into <processing_directory>/particleseries
and <processing_directory>/subtomo
respectively
with file names like TS_1_4.00A_000001.mrcs
.
If you're working on many different objects in your data and want to extract particles
at the same pixel size you risk overwriting previous particle sets. By specifying
--output_processing
, we redirect the output to a new directory which we will call
relion_15854
here.
WarpTools ts_export_particles \
--settings warp_tiltseries.settings \
--input_directory warp_tiltseries/matching \
--input_pattern "*15854_clean.star" \
--output_processing relion_15854 \
--output_angpix 4 \
--output_star relion_15854/matching_4apx.star \
--relative_output_paths \
--normalized_coords \
--box 96 \
--diameter 130 \
--2d
This produces a new directory relion_15854
the following structure
relion_15854
├── logs
├── particleseries
│ ├── TS_1
│ ├── TS_11
│ ├── TS_17
│ ├── TS_23
│ └── TS_32
├── dummy_tiltseries.mrc
├── matching_4apx_optimisation_set.star
├── matching_4apx.star
└── matching_4apx_tomograms.star
Running different tilt-series alignment programs
We can use the --output_processing
option as we did in the particle export example
to test different tilt series alignment methods.
We will need to add --input_processing
to our ts_reconstruct
call
if we want to reconstruct using these alignments.
Let's run both Etomo's patch tracking and AreTomo on some data
WarpTools ts_etomo_patches \
--settings warp_tiltseries.settings \
--angpix 10 \
--patch_size 1000 \
--do_axis_search \
--output_processing etomo_patches_1000
WarpTools ts_aretomo \
--settings warp_tiltseries.settings \
--angpix 10 \
--alignz 800 \
--axis_iter 5 \
--min_fov 0 \
--output_processing aretomo_alignz_800
To reconstruct each of these datasets, we will need to specify the --output_processing
directory as the --input_processing
directory for ts_reconstruct
.
WarpTools ts_reconstruct \
--settings warp_tiltseries.settings \
--input_processing aretomo_alignz_800 \
--angpix 10
Tip
If --output_processing
is not specified then output will be written
into the same directory. Output can be further redirected by specifying --output_processing
.
Some caveats
This mechanism is imperfect, if you try to run a process which depends on earlier results you might run into trouble. We're working on this 🙂