presto.Multiprocessing
Multiprocessing functions
-
class presto.Multiprocessing.SeqData(id, data)
Bases: object
Class defining sequence data objects for worker processes.
- id – unique identifier.
- data – single object or a list of data objects.
- valid – if True, data is suitable for processing.
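As a rough illustration of the attributes listed above, the following is a minimal sketch of a container with the same three fields. The class name `SketchSeqData` and the validity rule (a single object must not be None; a list must be non-empty and contain no None entries) are assumptions made for illustration only. The actual SeqData class may decide validity differently.

```python
class SketchSeqData:
    """Hypothetical stand-in mirroring the documented id, data and valid attributes."""

    def __init__(self, id, data):
        self.id = id
        self.data = data
        # Assumed rule (not taken from presto): a list must be non-empty with
        # no None entries; a single object must simply not be None.
        if isinstance(data, list):
            self.valid = len(data) > 0 and all(x is not None for x in data)
        else:
            self.valid = data is not None


single = SketchSeqData("seq1", "ACGT")   # one object
empty = SketchSeqData("set1", [])        # an empty set of objects
print(single.valid, empty.valid)
```

A worker can then check `valid` before attempting any processing and skip or log the entry otherwise.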
-
class presto.Multiprocessing.SeqResult(id, data)
Bases: object
Class defining sequence result objects for collector processes.
- id – unique identifier.
- data – single unprocessed object or a list of unprocessed data objects.
- results – single processed object or a list of processed data objects.
- valid – if True, processing was successful.
- log – dictionary containing the processing log.
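To show how these five attributes might be filled in by a worker, here is a small sketch. `SketchSeqResult` and `reverse_worker` are hypothetical names, and the initial values (`results=None`, `valid=False`, an empty ordered log) are assumptions rather than the documented behaviour of SeqResult.

```python
from collections import OrderedDict


class SketchSeqResult:
    """Hypothetical stand-in mirroring the documented SeqResult attributes."""

    def __init__(self, id, data):
        self.id = id
        self.data = data          # unprocessed input
        self.results = None       # assumed default until a worker fills it in
        self.valid = False        # assumed default; set to True on success
        self.log = OrderedDict()  # assumed to be an ordered mapping


def reverse_worker(seq_id, sequence):
    """Hypothetical worker step: reverse a sequence string and log the outcome."""
    result = SketchSeqResult(seq_id, sequence)
    result.log["ID"] = seq_id
    if sequence:
        result.results = sequence[::-1]
        result.valid = True
    else:
        result.log["ERROR"] = "empty sequence"
    return result


ok = reverse_worker("seq1", "ACGT")
bad = reverse_worker("seq2", "")
print(ok.valid, ok.results, bad.valid, dict(bad.log))
```

Keeping the unprocessed `data` alongside `results` lets a collector write failed inputs to a separate file without asking the worker for them again.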
-
presto.Multiprocessing.collectPairQueue(alive, result_queue, collect_queue, seq_file_1, seq_file_2, label, out_file=None, out_args={'delimiter': ('|', '=', ', '), 'failed': True, 'log_file': None, 'out_dir': None, 'out_name': None, 'out_type': None, 'separator': ', '})
Pulls from the results queue, assembles results, and manages log and file I/O.
Parameters:
- alive – a multiprocessing.Value boolean controlling whether processing continues; when False, the function returns.
- result_queue – a multiprocessing.Queue holding worker results.
- collect_queue – a multiprocessing.Queue holding collector return values.
- seq_file_1 – the first sequence file name.
- seq_file_2 – the second sequence file name.
- label – task label used to tag the output files.
- out_file – output file name. Automatically generated from the input file if None.
- out_args – common output argument dictionary from parseCommonArgs.
Returns: adds a dictionary of {log: log object, out_files: output file names} to collect_queue.
Return type:
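The collector pattern described above can be sketched as follows. This is not the presto implementation: `collect_sketch` is a hypothetical function, results are plain dictionaries, the `None` end-of-results sentinel is an assumption, and no files are written. A `queue.Queue` and a `SimpleNamespace` stand in for `multiprocessing.Queue` and `multiprocessing.Value` so the example runs in a single process.

```python
import queue
from types import SimpleNamespace


def collect_sketch(alive, result_queue, collect_queue):
    """Hypothetical collector: drain results until a None sentinel arrives."""
    log = {"PASS": 0, "FAIL": 0}
    out_files = []  # a real collector would record the files it wrote
    while alive.value:
        result = result_queue.get()
        if result is None:  # assumed sentinel marking the end of results
            break
        log["PASS" if result["valid"] else "FAIL"] += 1
    else:
        # alive was cleared before the sentinel arrived: stop without reporting
        return None
    # Report back in the documented shape: a dictionary with log and out_files
    collect_queue.put({"log": log, "out_files": out_files})
    return None


alive = SimpleNamespace(value=True)  # stands in for multiprocessing.Value
results, collected = queue.Queue(), queue.Queue()
for item in ({"valid": True}, {"valid": False}, {"valid": True}, None):
    results.put(item)

collect_sketch(alive, results, collected)
summary = collected.get()
print(summary)
```

Because the function reports through `collect_queue` rather than a return value, the parent process can read the summary after the collector process has exited.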
-
presto.Multiprocessing.collectSeqQueue(alive, result_queue, collect_queue, seq_file, label, index_field=None, out_file=None, out_args={'delimiter': ('|', '=', ', '), 'failed': True, 'log_file': None, 'out_dir': None, 'out_name': None, 'out_type': None, 'separator': ', '})
Pulls from the results queue, assembles results, and manages log and file I/O.
Parameters:
- alive – a multiprocessing.Value boolean controlling whether processing continues; when False, the function returns.
- result_queue – a multiprocessing.Queue holding worker results.
- collect_queue – a multiprocessing.Queue to store collector return values.
- seq_file – sample sequence file name.
- label – task label used to tag the output files.
- out_file – output file name. Automatically generated from the input file if None.
- out_args – common output argument dictionary from parseCommonArgs.
- index_field – field defining set membership for sequence sets; if None, the data queue contained individual records.
Returns: adds a dictionary to collect_queue containing 'log', defining a log object, and 'out_files', defining the output file names.
Return type:
-
presto.Multiprocessing.feedPairQueue(alive, data_queue, seq_file_1, seq_file_2, coord_type='presto', delimiter=('|', '=', ', '))
Feeds the data queue with sequence pairs for processQueue processes.
Parameters:
- alive – a multiprocessing.Value boolean controlling whether processing continues; when False, the function returns
- data_queue – a multiprocessing.Queue to hold data for processing
- seq_file_1 – the name of sequence file 1
- seq_file_2 – the name of sequence file 2
- coord_type – the sequence header format
- delimiter – a tuple of delimiters for (fields, values, value lists)
Returns: None
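The delimiter parameter is described as a tuple of (fields, values, value lists) delimiters. The sketch below shows one plausible reading of that tuple when applied to an annotated sequence header. `parse_header_sketch`, the example header, and the use of a bare comma as the value-list delimiter are all assumptions for illustration; presto's own annotation parsing may behave differently.

```python
def parse_header_sketch(header, delimiter=("|", "=", ",")):
    """Hypothetical parser: split a header using (field, value, list) delimiters."""
    field_delim, value_delim, list_delim = delimiter
    fields = header.split(field_delim)
    # Assumption: the first field is the bare sequence identifier
    parsed = {"ID": fields[0]}
    for field in fields[1:]:
        key, _, value = field.partition(value_delim)
        values = value.split(list_delim)
        # Keep single values as strings and multiple values as lists
        parsed[key] = values if len(values) > 1 else values[0]
    return parsed


header = "SEQ1|PRIMER=V1,V2|COUNT=3"
annotations = parse_header_sketch(header)
print(annotations)
```

Pairing two files then reduces to comparing the identifiers recovered from each header, which is presumably where coord_type comes in for headers that are not in this annotated format.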
-
presto.Multiprocessing.feedSeqQueue(alive, data_queue, seq_file, index_func=None, index_args={})
Feeds the data queue with SeqRecord objects.
Parameters:
- alive – a multiprocessing.Value boolean controlling whether processing continues; when False, the function returns
- data_queue – a multiprocessing.Queue to hold data for processing
- seq_file – sequence file to read input from
- index_func – function used to define sequence sets; if None, sets are not indexed and individual records are fed
- index_args – dictionary of arguments to pass to index_func
Returns: None
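The difference between feeding individual records and feeding indexed sets can be sketched as below. This is an illustration under assumptions, not the presto code: `feed_sketch` and `group_by_prefix` are hypothetical, records are plain `(id, sequence)` tuples read from a list rather than a sequence file, and a `queue.Queue` with a `SimpleNamespace` replaces the multiprocessing objects so the example runs in one process.

```python
import queue
from types import SimpleNamespace


def group_by_prefix(records, sep="_"):
    """Hypothetical index function: group sequences by the ID text before sep."""
    groups = {}
    for rec_id, seq in records:
        groups.setdefault(rec_id.split(sep)[0], []).append(seq)
    return groups.items()


def feed_sketch(alive, data_queue, records, index_func=None, index_args=None):
    """Hypothetical feeder: put (key, data) items on the queue while alive is set."""
    if index_func is None:
        items = iter(records)  # individual records
    else:
        items = index_func(records, **(index_args or {}))  # sequence sets
    for key, data in items:
        if not alive.value:
            return None  # stop early when processing is cancelled
        data_queue.put((key, data))
    return None


alive = SimpleNamespace(value=True)  # stands in for multiprocessing.Value
records = [("A_1", "ACGT"), ("A_2", "ACGA"), ("B_1", "TTTT")]

single_queue, set_queue = queue.Queue(), queue.Queue()
feed_sketch(alive, single_queue, records)
feed_sketch(alive, set_queue, records, index_func=group_by_prefix, index_args={"sep": "_"})

singles = [single_queue.get() for _ in range(single_queue.qsize())]
sets = [set_queue.get() for _ in range(set_queue.qsize())]
print(singles)
print(sets)
```

With no index function the workers receive one record per queue item; with one, each item carries a whole set, so a worker can operate on all members of a group at once.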
-
presto.Multiprocessing.manageProcesses(feed_func, work_func, collect_func, feed_args={}, work_args={}, collect_args={}, nproc=None, queue_size=None)
Manages feeder, worker and collector processes.
Parameters:
- feed_func (function) – Data Queue feeder function.
- work_func (function) – Worker function.
- collect_func (function) – Result Queue collector function.
- feed_args (dict) – Dictionary of arguments to pass to feed_func.
- work_args (dict) – Dictionary of arguments to pass to work_func.
- collect_args (dict) – Dictionary of arguments to pass to collect_func.
- nproc (int) – Number of processQueue processes; if None, defaults to the number of CPUs.
- queue_size (int) – Maximum size of the argument queue; if None, defaults to 2*nproc.
Returns: Dictionary of collector results
Return type: dict
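The defaults described for nproc and queue_size can be sketched as a small helper. `resolve_pool_settings` is a hypothetical name, and using `multiprocessing.cpu_count()` for "the number of CPUs" is an assumption about how that default is obtained.

```python
import multiprocessing as mp


def resolve_pool_settings(nproc=None, queue_size=None):
    """Hypothetical helper applying the documented defaults for nproc and queue_size."""
    if nproc is None:
        nproc = mp.cpu_count()  # assumed source of "the number of CPUs"
    if queue_size is None:
        queue_size = 2 * nproc  # documented default: twice the worker count
    return nproc, queue_size


nproc, queue_size = resolve_pool_settings(nproc=4)
print(nproc, queue_size)
```

Bounding the queue at a small multiple of the worker count keeps the feeder from reading the entire input into memory while still leaving each worker a pending item to pick up.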
-
presto.Multiprocessing.processSeqQueue(alive, data_queue, result_queue, process_func, process_args={})
Pulls from the data queue, performs calculations, and feeds the results queue.
Parameters:
- alive – a multiprocessing.Value boolean controlling whether processing continues; when False, the function returns
- data_queue – multiprocessing.Queue holding data to process
- result_queue – multiprocessing.Queue to hold processed results
- process_func – function to use for processing sequences
- process_args – Dictionary of arguments to pass to process_func
Returns: None
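A worker loop of the kind described above might look like the following sketch. `process_sketch` and `trim` are hypothetical, the `None` sentinel that ends the loop is an assumption, and `queue.Queue` with `SimpleNamespace` stands in for the multiprocessing objects so the example runs in a single process.

```python
import queue
from types import SimpleNamespace


def trim(seq, length=2):
    """Hypothetical processing function: keep the first `length` characters."""
    return seq[:length]


def process_sketch(alive, data_queue, result_queue, process_func, process_args=None):
    """Hypothetical worker: apply process_func to each queued item until a sentinel."""
    process_args = process_args or {}
    while alive.value:
        item = data_queue.get()
        if item is None:  # assumed sentinel marking the end of the data
            break
        result_queue.put(process_func(item, **process_args))
    return None


alive = SimpleNamespace(value=True)  # stands in for multiprocessing.Value
data, results = queue.Queue(), queue.Queue()
for item in ("ACGT", "TTGA", None):
    data.put(item)

process_sketch(alive, data, results, trim, {"length": 3})
processed = [results.get() for _ in range(results.qsize())]
print(processed)
```

Passing the processing function and its arguments separately keeps the loop generic, so the same worker can serve any task that takes one data item and keyword arguments.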