Commit a4522ac3149be600e4c06616864584d688a73d27 - r-cran-doparallel

Update upstream source from tag 'upstream/1.0.15'

Update to upstream version '1.0.15'
with Debian dir 903c2a46fbcf238cd07c312b0128abbc0deb541c

Andreas Tille, 4 years ago

16 changed file(s) with 1181 addition(s) and 1184 deletion(s).

-10

DESCRIPTION

0	0	Package: doParallel
1	1	Type: Package
2	2	Title: Foreach Parallel Adaptor for the 'parallel' Package
3		Version: 1.0.14
4		Authors@R: c(person("Rich", "Calaway", role="cre", email="richcala@microsoft.com"),
	3	Version: 1.0.15
	4	Authors@R: c(person("Hong", "Ooi", role="cre", email="hongooi@microsoft.com"),
5	5	person("Microsoft", "Corporation", role=c("aut", "cph")),
6	6	person("Steve", "Weston", role="aut"),
7	7	person("Dan", "Tenenbaum", role="ctb"))

12	12	Suggests: caret, mlbench, rpart, RUnit
13	13	Enhances: compiler
14	14	License: GPL-2
15		Author: Rich Calaway [cre],
	15	NeedsCompilation: no
	16	Packaged: 2019-07-16 22:51:14 UTC; richcala
	17	Author: Hong Ooi [cre],
16	18	Microsoft Corporation [aut, cph],
17	19	Steve Weston [aut],
18	20	Dan Tenenbaum [ctb]
19		Maintainer: Rich Calaway <richcala@microsoft.com>
	21	Maintainer: Hong Ooi <hongooi@microsoft.com>
20	22	Repository: CRAN
21		Repository/R-Forge/Project: doparallel
22		Repository/R-Forge/Revision: 21
23		Repository/R-Forge/DateTimeStamp: 2018-09-21 22:41:46
24		Date/Publication: 2018-09-24 19:20:09 UTC
25		NeedsCompilation: no
26		Packaged: 2018-09-21 22:51:03 UTC; rforge
	23	Date/Publication: 2019-08-02 04:40:02 UTC

+15

-15

MD5

0		9f1b3d50fb3be83255df9314d9e0cb3c *DESCRIPTION
1		16c1196e34ef2f64277123d8d53442f5 *NAMESPACE
2		0fed2a6f4bbf50ec3828379a80ef3618 *NEWS
	0	feb5e9178558c0670eb510c6ef12bda4 *DESCRIPTION
	1	fb79212b5b9dc6ece74fe33fd609ed7e *NAMESPACE
	2	c3e7257d3042dad506b9b9c01dc7c838 *NEWS
3	3	79fc414268c56bae63ad3081b50405dc *R/doParallel.R
4	4	86f0e4745e79399332a21f661de57bbb *R/zzz.R
5		090be37f676d0f00a7e16b77c601a7dd *build/vignette.rds
6		ad6e7aeda54fa895a60fd8c0c92a39bf *demo/00Index
7		acd97a961dc67743d9ae85b28aa8fec1 *demo/sincParallel.R
8		d1d107a8aed2c92fe6efa71cbc691831 *inst/doc/gettingstartedParallel.R
9		bf3cfed8a81605cf697c7e1e95bd856c *inst/doc/gettingstartedParallel.Rnw
10		2a012c168cfaa0d75aa1890391de3e35 *inst/doc/gettingstartedParallel.pdf
11		0a17c88eb4ddb5c75a71bd940627f1b1 *inst/examples/bootParallel.R
12		f2621d4a791a20471698dfe4ceb351eb *inst/unitTests/options.R
13		59ecbac80339ba8a55adc7ec51ced837 *inst/unitTests/runTestSuite.sh
14		127e4697324d014bdf67e3e3c9ddf80f *man/doParallel-package.Rd
15		8f2ff4e8944398c34a7add4667cec738 *man/registerDoParallel.Rd
	5	8d524fd555d56a6e715447fdcb9aa21f *build/vignette.rds
	6	0004a14592476b09378f53e6d915d419 *demo/00Index
	7	657dd86a2b23acaeb44861433300d2ca *demo/sincParallel.R
	8	674625575a46e398efb8b965cebf67da *inst/doc/gettingstartedParallel.R
	9	09654ec2bef8300c0ec42470cbe479e9 *inst/doc/gettingstartedParallel.Rnw
	10	0a965581b17ab4c0f431abcd7bcb94ca *inst/doc/gettingstartedParallel.pdf
	11	b278debda756976016cf4fd810817a5d *inst/examples/bootParallel.R
	12	9370fa8163f85d43e94b220a109cf32f *inst/unitTests/options.R
	13	530a76cc5343e76d39c5aa6e2a469fba *inst/unitTests/runTestSuite.sh
	14	0288e366be373ba1c6378cba2217e797 *man/doParallel-package.Rd
	15	e32ac0edf3d2b9cf40b27c65c55c046a *man/registerDoParallel.Rd
16	16	c839b703f8dc3cb5c79d48385effe11c *tests/doRUnit.R
17		bf3cfed8a81605cf697c7e1e95bd856c *vignettes/gettingstartedParallel.Rnw
	17	09654ec2bef8300c0ec42470cbe479e9 *vignettes/gettingstartedParallel.Rnw

-7

NAMESPACE

0		export(registerDoParallel)
1		export(stopImplicitCluster)
2		importFrom("utils", "packageDescription", "packageName")
3		import(foreach)
4		import(iterators)
5		import(parallel)
6
	0	export(registerDoParallel)
	1	export(stopImplicitCluster)
	2	importFrom("utils", "packageDescription", "packageName")
	3	import(foreach)
	4	import(iterators)
	5	import(parallel)
	6

+43

-43

NEWS

0		NEWS/ChangeLog for doParallel
1		-----------------------------
2
3		1.0.14 2018-09-24
4		o Re-enabled tests.
5		o Moved RUnit from Enhances to Suggests (request of Kurt Hornik)
6
7		1.0.13 2018-04-04
8		o Changes to support enhanced exports via future (if available).
9
10		1.0.12 2017-12-08
11		o Change test report path for compliance with CRAN policies.
12
13		1.0.9 2015-09-21
14		o Bug fixes to stopImplicitCluster functionality, courtesy of Dan Tenenbaum.
15
16		1.0.8 2014-02-25
17		o Modified vignette to use no more than two workers.
18
19		1.0.7 2014-02-01
20		o Modified to work better when a foreach loop is executed
21		in a package (courtesy of Steve Weston)
22		o Added unit tests and a minimal working example
23
24		1.0.6 2013-10-25
25		o Changed foreach, iterators, and parallel from Depends to
26		Imports (request of Steve Weston and Stefan Schlager)
27
28		1.0.4 2013-09-01
29		o New attachExportEnv option for doParallelSNOW
30		o New function stopImplicitCluster to stop the implicitly created
31		socket cluster.
32		o Updated inst/unitTests/runTestSuite.sh, bug report from Michael Cheng
33
34		1.0.3 2013-06-06
35		o New preschedule option for doParallelSNOW, courtesy of Steve Weston
36		o Removed assignment into global environment to meet CRAN standards.
37
38		1.0.2 2013-05-29
39		o Efficiency improvements courtesy of Steve Weston
40
41		1.0.1 2012-04-09
42		o Updated to support RevoScaleR's rxExec function
	0	NEWS/ChangeLog for doParallel
	1	-----------------------------
	2
	3	1.0.14 2018-09-24
	4	o Re-enabled tests.
	5	o Moved RUnit from Enhances to Suggests (request of Kurt Hornik)
	6
	7	1.0.13 2018-04-04
	8	o Changes to support enhanced exports via future (if available).
	9
	10	1.0.12 2017-12-08
	11	o Change test report path for compliance with CRAN policies.
	12
	13	1.0.9 2015-09-21
	14	o Bug fixes to stopImplicitCluster functionality, courtesy of Dan Tenenbaum.
	15
	16	1.0.8 2014-02-25
	17	o Modified vignette to use no more than two workers.
	18
	19	1.0.7 2014-02-01
	20	o Modified to work better when a foreach loop is executed
	21	in a package (courtesy of Steve Weston)
	22	o Added unit tests and a minimal working example
	23
	24	1.0.6 2013-10-25
	25	o Changed foreach, iterators, and parallel from Depends to
	26	Imports (request of Steve Weston and Stefan Schlager)
	27
	28	1.0.4 2013-09-01
	29	o New attachExportEnv option for doParallelSNOW
	30	o New function stopImplicitCluster to stop the implicitly created
	31	socket cluster.
	32	o Updated inst/unitTests/runTestSuite.sh, bug report from Michael Cheng
	33
	34	1.0.3 2013-06-06
	35	o New preschedule option for doParallelSNOW, courtesy of Steve Weston
	36	o Removed assignment into global environment to meet CRAN standards.
	37
	38	1.0.2 2013-05-29
	39	o Efficiency improvements courtesy of Steve Weston
	40
	41	1.0.1 2012-04-09
	42	o Updated to support RevoScaleR's rxExec function

build/vignette.rds

Binary diff not shown

-1

demo/00Index

0		sincParallel computation of the sinc function
	0	sincParallel computation of the sinc function

+37

-37

demo/sincParallel.R

0		library(doParallel)
1		registerDoParallel()
2
3		# Define a function that creates an iterator that returns subvectors
4		ivector <- function(x, chunks) {
5		n <- length(x)
6		i <- 1
7
8		nextEl <- function() {
9		if (chunks <= 0 || n <= 0) stop('StopIteration')
10		m <- ceiling(n / chunks)
11		r <- seq(i, length=m)
12		i <<- i + m
13		n <<- n - m
14		chunks <<- chunks - 1
15		x[r]
16		}
17
18		obj <- list(nextElem=nextEl)
19		class(obj) <- c('abstractiter', 'iter')
20		obj
21		}
22
23		# Define the coordinate grid and figure out how to split up the work
24		x <- seq(-10, 10, by=0.1)
25		nw <- getDoParWorkers()
26		cat(sprintf('Running with %d worker(s)\n', nw))
27
28		# Compute the value of the sinc function at each point in the grid
29		z <- foreach(y=ivector(x, nw), .combine=cbind) %dopar% {
30		y <- rep(y, each=length(x))
31		r <- sqrt(x ^ 2 + y ^ 2)
32		matrix(10 * sin(r) / r, length(x))
33		}
34
35		# Plot the results as a perspective plot
36		persp(x, x, z, ylab='y', theta=30, phi=30, expand=0.5, col="lightblue")
	0	library(doParallel)
	1	registerDoParallel()
	2
	3	# Define a function that creates an iterator that returns subvectors
	4	ivector <- function(x, chunks) {
	5	n <- length(x)
	6	i <- 1
	7
	8	nextEl <- function() {
	9	if (chunks <= 0 || n <= 0) stop('StopIteration')
	10	m <- ceiling(n / chunks)
	11	r <- seq(i, length=m)
	12	i <<- i + m
	13	n <<- n - m
	14	chunks <<- chunks - 1
	15	x[r]
	16	}
	17
	18	obj <- list(nextElem=nextEl)
	19	class(obj) <- c('abstractiter', 'iter')
	20	obj
	21	}
	22
	23	# Define the coordinate grid and figure out how to split up the work
	24	x <- seq(-10, 10, by=0.1)
	25	nw <- getDoParWorkers()
	26	cat(sprintf('Running with %d worker(s)\n', nw))
	27
	28	# Compute the value of the sinc function at each point in the grid
	29	z <- foreach(y=ivector(x, nw), .combine=cbind) %dopar% {
	30	y <- rep(y, each=length(x))
	31	r <- sqrt(x ^ 2 + y ^ 2)
	32	matrix(10 * sin(r) / r, length(x))
	33	}
	34
	35	# Plot the results as a perspective plot
	36	persp(x, x, z, ylab='y', theta=30, phi=30, expand=0.5, col="lightblue")
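
Note on the demo above: the chunk arithmetic inside ivector() can be checked without registering any parallel backend. The following is a minimal base-R sketch, not part of the commit. The helper name split_chunks is hypothetical, and it collects all chunks into a list up front instead of handing them out one at a time as the demo's iterator does.

```r
# Hypothetical helper (not in doParallel): repeats the chunk-size arithmetic
# used by ivector() in demo/sincParallel.R as a plain loop, so it runs
# without foreach, iterators, or a registered backend.
split_chunks <- function(x, chunks) {
  n <- length(x)   # elements still to be handed out
  i <- 1           # index of the first element of the next chunk
  out <- list()
  while (chunks > 0 && n > 0) {
    m <- ceiling(n / chunks)                      # size of the next chunk
    out[[length(out) + 1]] <- x[seq(i, length.out = m)]
    i <- i + m
    n <- n - m
    chunks <- chunks - 1
  }
  out
}

# Same coordinate grid as the demo; 4 stands in for getDoParWorkers().
x <- seq(-10, 10, by = 0.1)
parts <- split_chunks(x, 4)
chunk_sizes <- lengths(parts)
```

Because each chunk size is recomputed from the remaining elements and the remaining chunk count, the chunks differ in length by at most one and together cover x exactly once, in order.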

+73

-73

inst/doc/gettingstartedParallel.R less more

0		### R code from vignette source 'gettingstartedParallel.Rnw'
1
2		###################################################
3		### code chunk number 1: loadLibs
4		###################################################
5		library(doParallel)
6		cl <- makeCluster(2)
7		registerDoParallel(cl)
8		foreach(i=1:3) %dopar% sqrt(i)
9
10
11		###################################################
12		### code chunk number 2: gettingstartedParallel.Rnw:149-150
13		###################################################
14		stopCluster(cl)
15
16
17		###################################################
18		### code chunk number 3: gettingstartedParallel.Rnw:193-196
19		###################################################
20		library(doParallel)
21		cl <- makeCluster(2)
22		registerDoParallel(cl)
23
24
25		###################################################
26		### code chunk number 4: bootpar
27		###################################################
28		x <- iris[which(iris[,5] != "setosa"), c(1,5)]
29		trials <- 10000
30
31		ptime <- system.time({
32		r <- foreach(icount(trials), .combine=cbind) %dopar% {
33		ind <- sample(100, 100, replace=TRUE)
34		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
35		coefficients(result1)
36		}
37		})[3]
38		ptime
39
40
41		###################################################
42		### code chunk number 5: bootseq
43		###################################################
44		stime <- system.time({
45		r <- foreach(icount(trials), .combine=cbind) %do% {
46		ind <- sample(100, 100, replace=TRUE)
47		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
48		coefficients(result1)
49		}
50		})[3]
51		stime
52
53
54		###################################################
55		### code chunk number 6: getDoParWorkers
56		###################################################
57		getDoParWorkers()
58
59
60		###################################################
61		### code chunk number 7: getDoParName
62		###################################################
63		getDoParName()
64		getDoParVersion()
65
66
67		###################################################
68		### code chunk number 8: gettingstartedParallel.Rnw:274-275
69		###################################################
70		stopCluster(cl)
71
72
	0	### R code from vignette source 'gettingstartedParallel.Rnw'
	1
	2	###################################################
	3	### code chunk number 1: loadLibs
	4	###################################################
	5	library(doParallel)
	6	cl <- makeCluster(2)
	7	registerDoParallel(cl)
	8	foreach(i=1:3) %dopar% sqrt(i)
	9
	10
	11	###################################################
	12	### code chunk number 2: gettingstartedParallel.Rnw:149-150
	13	###################################################
	14	stopCluster(cl)
	15
	16
	17	###################################################
	18	### code chunk number 3: gettingstartedParallel.Rnw:193-196
	19	###################################################
	20	library(doParallel)
	21	cl <- makeCluster(2)
	22	registerDoParallel(cl)
	23
	24
	25	###################################################
	26	### code chunk number 4: bootpar
	27	###################################################
	28	x <- iris[which(iris[,5] != "setosa"), c(1,5)]
	29	trials <- 10000
	30
	31	ptime <- system.time({
	32	r <- foreach(icount(trials), .combine=cbind) %dopar% {
	33	ind <- sample(100, 100, replace=TRUE)
	34	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	35	coefficients(result1)
	36	}
	37	})[3]
	38	ptime
	39
	40
	41	###################################################
	42	### code chunk number 5: bootseq
	43	###################################################
	44	stime <- system.time({
	45	r <- foreach(icount(trials), .combine=cbind) %do% {
	46	ind <- sample(100, 100, replace=TRUE)
	47	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	48	coefficients(result1)
	49	}
	50	})[3]
	51	stime
	52
	53
	54	###################################################
	55	### code chunk number 6: getDoParWorkers
	56	###################################################
	57	getDoParWorkers()
	58
	59
	60	###################################################
	61	### code chunk number 7: getDoParName
	62	###################################################
	63	getDoParName()
	64	getDoParVersion()
	65
	66
	67	###################################################
	68	### code chunk number 8: gettingstartedParallel.Rnw:274-275
	69	###################################################
	70	stopCluster(cl)
	71
	72

+344

-344

inst/doc/gettingstartedParallel.Rnw less more

0		% \VignetteIndexEntry{Getting Started with doParallel and foreach}
1		% \VignetteDepends{doParallel}
2		% \VignetteDepends{foreach}
3		% \VignettePackage{doParallel}
4		\documentclass[12pt]{article}
5		\usepackage{amsmath}
6		\usepackage[pdftex]{graphicx}
7		\usepackage{color}
8		\usepackage{xspace}
9		\usepackage{url}
10		\usepackage{fancyvrb}
11		\usepackage{fancyhdr}
12		\usepackage[
13		colorlinks=true,
14		linkcolor=blue,
15		citecolor=blue,
16		urlcolor=blue]
17		{hyperref}
18		\usepackage{lscape}
19
20		\usepackage{Sweave}
21
22		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
23
24		% define new colors for use
25		\definecolor{darkgreen}{rgb}{0,0.6,0}
26		\definecolor{darkred}{rgb}{0.6,0.0,0}
27		\definecolor{lightbrown}{rgb}{1,0.9,0.8}
28		\definecolor{brown}{rgb}{0.6,0.3,0.3}
29		\definecolor{darkblue}{rgb}{0,0,0.8}
30		\definecolor{darkmagenta}{rgb}{0.5,0,0.5}
31
32		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
33
34		\newcommand{\bld}[1]{\mbox{\boldmath $#1$}}
35		\newcommand{\shell}[1]{\mbox{$#1$}}
36		\renewcommand{\vec}[1]{\mbox{\bf {#1}}}
37
38		\newcommand{\ReallySmallSpacing}{\renewcommand{\baselinestretch}{.6}\Large\normalsize}
39		\newcommand{\SmallSpacing}{\renewcommand{\baselinestretch}{1.1}\Large\normalsize}
40
41		\newcommand{\halfs}{\frac{1}{2}}
42
43		\setlength{\oddsidemargin}{-.25 truein}
44		\setlength{\evensidemargin}{0truein}
45		\setlength{\topmargin}{-0.2truein}
46		\setlength{\textwidth}{7 truein}
47		\setlength{\textheight}{8.5 truein}
48		\setlength{\parindent}{0.20truein}
49		\setlength{\parskip}{0.10truein}
50
51		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
52		\pagestyle{fancy}
53		\lhead{}
54		\chead{Getting Started with doParallel and foreach}
55		\rhead{}
56		\lfoot{}
57		\cfoot{}
58		\rfoot{\thepage}
59		\renewcommand{\headrulewidth}{1pt}
60		\renewcommand{\footrulewidth}{1pt}
61		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
62
63		\title{Getting Started with doParallel and foreach}
64		\author{Steve Weston\footnote{Steve Weston wrote the original version of this vignette for the doMC package. Rich Calaway
65		adapted the vignette for doParallel.} and Rich Calaway \\ doc@revolutionanalytics.com}
66
67
68		\begin{document}
69
70		\maketitle
71
72		\thispagestyle{empty}
73
74		\section{Introduction}
75
76		The \texttt{doParallel} package is a ``parallel backend'' for the
77		\texttt{foreach} package. It provides a mechanism needed to execute
78		\texttt{foreach} loops in parallel. The \texttt{foreach} package must
79		be used in conjunction with a package such as \texttt{doParallel} in order to
80		execute code in parallel. The user must register a parallel backend to
81		use, otherwise \texttt{foreach} will execute tasks sequentially, even
82		when the \%dopar\% operator is used.\footnote{\texttt{foreach} will
83		issue a warning that it is running sequentially if no parallel backend
84		has been registered. It will only issue this warning once, however.}
85
86		The \texttt{doParallel} package acts as an interface between \texttt{foreach}
87		and the \texttt{parallel} package of R 2.14.0 and later. The \texttt{parallel}
88		package is essentially a merger of the \texttt{multicore} package, which was
89		written by Simon Urbanek, and the \texttt{snow} package, which was written
90		by Luke Tierney and others. The \texttt{multicore} functionality supports
91		multiple workers only on those operating systems that
92		support the \texttt{fork} system call; this excludes Windows. By default,
93		\texttt{doParallel} uses \texttt{multicore} functionality on Unix-like
94		systems and \texttt{snow} functionality on Windows. Note that
95		the \texttt{multicore} functionality only runs tasks on a single
96		computer, not a cluster of computers. However, you can use the
97		\texttt{snow} functionality to execute on a cluster, using Unix-like
98		operating systems, Windows, or even a combination.
99		It is pointless to use \texttt{doParallel} and \texttt{parallel}
100		on a machine with only one processor with a single core. To get a speed
101		improvement, it must run on a machine with multiple processors, multiple
102		cores, or both.
103
104		\section{A word of caution}
105
106		Because the \texttt{parallel} package in \texttt{multicore} mode
107		starts its workers using
108		\texttt{fork} without doing a subsequent \texttt{exec}, it has some
109		limitations. Some operations cannot be performed properly by forked
110		processes. For example, connection objects very likely won't work.
111		In some cases, this could cause an object to become corrupted, and
112		the R session to crash.
113
114		\section{Registering the \texttt{doParallel} parallel backend}
115
116		To register \texttt{doParallel} to be used with \texttt{foreach}, you must
117		call the \texttt{registerDoParallel} function. If you call this with no
118		arguments, on Windows you will get three workers and on Unix-like
119		systems you will get a number of workers equal to approximately half the
120		number of cores on your system. You can also specify a cluster
121		(as created by the \texttt{makeCluster} function) or a number of cores.
122		The \texttt{cores} argument specifies the number of worker
123		processes that \texttt{doParallel} will use to execute tasks, which will
124		by default be
125		equal to one-half the total number of cores on the machine. You don't need to
126		specify a value for it, however. By default, \texttt{doParallel} will use the
127		value of the ``cores'' option, as specified with
128		the standard ``options'' function. If that isn't set, then
129		\texttt{doParallel} will try to detect the number of cores, and use one-half
130		that many workers.
131
132		Remember: unless \texttt{registerDoMC} is called, \texttt{foreach} will
133		{\em not} run in parallel. Simply loading the \texttt{doParallel} package is
134		not enough.
135
136		\section{An example \texttt{doParallel} session}
137
138		Before we go any further, let's load \texttt{doParallel}, register it, and use
139		it with \texttt{foreach}. We will use \texttt{snow}-like functionality in this
140		vignette, so we start by loading the package and starting a cluster:
141
142		<<loadLibs>>=
143		library(doParallel)
144		cl <- makeCluster(2)
145		registerDoParallel(cl)
146		foreach(i=1:3) %dopar% sqrt(i)
147		@
148		<<echo=FALSE>>=
149		stopCluster(cl)
150		@
151
152		To use \texttt{multicore}-like functionality, we would specify the number
153		of cores to use instead (but note that on Windows, attempting to use more
154		than one core with \texttt{parallel} results in an error):
155		\begin{verbatim}
156		library(doParallel}
157		registerDoParallel(cores=2)
158		foreach(i=1:3) %dopar% sqrt(i)
159		\end{verbatim}
160
161		\begin{quote}
162		Note well that this is {\em not} a practical use of \texttt{doParallel}. This
163		is our ``Hello, world'' program for parallel computing. It tests that
164		everything is installed and set up properly, but don't expect it to run
165		faster than a sequential \texttt{for} loop, because it won't!
166		\texttt{sqrt} executes far too quickly to be worth executing in
167		parallel, even with a large number of iterations. With small tasks, the
168		overhead of scheduling the task and returning the result can be greater
169		than the time to execute the task itself, resulting in poor performance.
170		In addition, this example doesn't make use of the vector capabilities of
171		\texttt{sqrt}, which it must to get decent performance. This is just a
172		test and a pedagogical example, {\em not} a benchmark.
173		\end{quote}
174
175		But returning to the point of this example, you can see that it is very
176		simple to load \texttt{doParallel} with all of its dependencies
177		(\texttt{foreach}, \texttt{iterators}, \texttt{parallel}, etc), and to
178		register it. For the rest of the R session, whenever you execute
179		\texttt{foreach} with \texttt{\%dopar\%}, the tasks will be executed
180		using \texttt{doParallel} and \texttt{parallel}. Note that you can register
181		a different parallel backend later, or deregister \texttt{doParallel} by
182		registering the sequential backend by calling the \texttt{registerDoSEQ}
183		function.
184
185		\section{A more serious example}
186
187		Now that we've gotten our feet wet, let's do something a bit less
188		trivial. One good example is bootstrapping. Let's see how long it
189		takes to run 10,000 bootstrap iterations in parallel on
190		\Sexpr{getDoParWorkers()} cores:
191
192		<<echo=FALSE>>=
193		library(doParallel)
194		cl <- makeCluster(2)
195		registerDoParallel(cl)
196		@
197		<<bootpar>>=
198		x <- iris[which(iris[,5] != "setosa"), c(1,5)]
199		trials <- 10000
200
201		ptime <- system.time({
202		r <- foreach(icount(trials), .combine=cbind) %dopar% {
203		ind <- sample(100, 100, replace=TRUE)
204		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
205		coefficients(result1)
206		}
207		})[3]
208		ptime
209		@
210
211		Using \texttt{doParallel} and \texttt{parallel} we were able to perform
212		10,000 bootstrap iterations in \Sexpr{ptime} seconds on
213		\Sexpr{getDoParWorkers()} cores. By changing the \texttt{\%dopar\%} to
214		\texttt{\%do\%}, we can run the same code sequentially to determine the
215		performance improvement:
216
217		<<bootseq>>=
218		stime <- system.time({
219		r <- foreach(icount(trials), .combine=cbind) %do% {
220		ind <- sample(100, 100, replace=TRUE)
221		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
222		coefficients(result1)
223		}
224		})[3]
225		stime
226		@
227
228
229		The sequential version ran in \Sexpr{stime} seconds, which means the
230		speed up is about \Sexpr{round(stime / ptime, digits=1)} on
231		\Sexpr{getDoParWorkers()} workers.\footnote{If you build this vignette
232		yourself, you can see how well this problem runs on your hardware. None
233		of the times are hardcoded in this document. You can also run the same
234		example which is in the examples directory of the \texttt{doParallel}
235		distribution.} Ideally, the speed up would be \Sexpr{getDoParWorkers()},
236		but no multicore CPUs are ideal, and neither are the operating systems
237		and software that run on them.
238
239		At any rate, this is a more realistic example that is worth executing in
240		parallel. We do not explain what it's doing or how it works
241		here. We just want to give you something more substantial than the
242		\texttt{sqrt} example in case you want to run some benchmarks yourself.
243		You can also run this example on a cluster by simply reregistering
244		with a cluster object that specifies the nodes to use. (See the
245		\texttt{makeCluster} help file for more details.)
246
247		\section{Getting information about the parallel backend}
248
249		To find out how many workers \texttt{foreach} is going to use, you can
250		use the \texttt{getDoParWorkers} function:
251
252		<<getDoParWorkers>>=
253		getDoParWorkers()
254		@
255
256		This is a useful sanity check that you're actually running in parallel.
257		If you haven't registered a parallel backend, or if your machine only
258		has one core, \texttt{getDoParWorkers} will return one. In either case,
259		don't expect a speed improvement. \texttt{foreach} is clever, but it
260		isn't magic.
261
262		The \texttt{getDoParWorkers} function is also useful when you want the
263		number of tasks to be equal to the number of workers. You may want to
264		pass this value to an iterator constructor, for example.
265
266		You can also get the name and version of the currently registered
267		backend:
268
269		<<getDoParName>>=
270		getDoParName()
271		getDoParVersion()
272		@
273		<<echo=FALSE>>=
274		stopCluster(cl)
275		@
276		This is mostly useful for documentation purposes, or for checking that
277		you have the most recent version of \texttt{doParallel}.
278
279		\section{Specifying multicore options}
280
281		When using \texttt{multicore}-like functionality, the \texttt{doParallel} package allows
282		you to specify various options when
283		running \texttt{foreach} that are supported by the underlying
284		\texttt{mclapply} function: ``preschedule'', ``set.seed'', ``silent'',
285		and ``cores''. You can learn about these options from the
286		\texttt{mclapply} man page. They are set using the \texttt{foreach}
287		\texttt{.options.multicore} argument. Here's an example of how to do
288		that:
289
290		\begin{verbatim}
291		mcoptions <- list(preschedule=FALSE, set.seed=FALSE)
292		foreach(i=1:3, .options.multicore=mcoptions) %dopar% sqrt(i)
293		\end{verbatim}
294
295		The ``cores'' options allows you to temporarily override the number of
296		workers to use for a single \texttt{foreach} operation. This is more
297		convenient than having to re-register \texttt{doParallel}. Although if no
298		value of ``cores'' was specified when \texttt{doParallel} was registered, you
299		can also change this value dynamically using the \texttt{options}
300		function:
301
302		\begin{verbatim}
303		options(cores=2)
304		getDoParWorkers()
305		options(cores=3)
306		getDoParWorkers()
307		\end{verbatim}
308
309		If you did specify the number of cores when registering \texttt{doParallel},
310		the ``cores'' option is ignored:
311
312		\begin{verbatim}
313		registerDoParallel(4)
314		options(cores=2)
315		getDoParWorkers()
316		\end{verbatim}
317
318		As you can see, there are a number of options for controlling the number
319		of workers to use with \texttt{parallel}, but the default behaviour
320		usually does what you want.
321
322		\section{Stopping your cluster}
323
324		If you are using \texttt{snow}-like functionality, you will want to stop your
325		cluster when you are done using it. The \texttt{doParallel} package's
326		\texttt{.onUnload} function will do this automatically if the cluster was created
327		automatically by \texttt{registerDoParallel}, but if you created the cluster manually
328		you should stop it using the \texttt{stopCluster} function:
329
330		\begin{verbatim}
331		stopCluster(cl)
332		\end{verbatim}
333
334		\section{Conclusion}
335
336		The \texttt{doParallel} and \texttt{parallel} packages provide a nice,
337		efficient parallel programming platform for multiprocessor/multicore
338		computers running operating systems such as Linux and Mac OS X. It is
339		very easy to install, and very easy to use. In short order, an average
340		R programmer can start executing parallel programs, without any previous
341		experience in parallel computing.
342
343		\end{document}
	0	% \VignetteIndexEntry{Getting Started with doParallel and foreach}
	1	% \VignetteDepends{doParallel}
	2	% \VignetteDepends{foreach}
	3	% \VignettePackage{doParallel}
	4	\documentclass[12pt]{article}
	5	\usepackage{amsmath}
	6	\usepackage[pdftex]{graphicx}
	7	\usepackage{color}
	8	\usepackage{xspace}
	9	\usepackage{url}
	10	\usepackage{fancyvrb}
	11	\usepackage{fancyhdr}
	12	\usepackage[
	13	colorlinks=true,
	14	linkcolor=blue,
	15	citecolor=blue,
	16	urlcolor=blue]
	17	{hyperref}
	18	\usepackage{lscape}
	19
	20	\usepackage{Sweave}
	21
	22	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	23
	24	% define new colors for use
	25	\definecolor{darkgreen}{rgb}{0,0.6,0}
	26	\definecolor{darkred}{rgb}{0.6,0.0,0}
	27	\definecolor{lightbrown}{rgb}{1,0.9,0.8}
	28	\definecolor{brown}{rgb}{0.6,0.3,0.3}
	29	\definecolor{darkblue}{rgb}{0,0,0.8}
	30	\definecolor{darkmagenta}{rgb}{0.5,0,0.5}
	31
	32	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	33
	34	\newcommand{\bld}[1]{\mbox{\boldmath $#1$}}
	35	\newcommand{\shell}[1]{\mbox{$#1$}}
	36	\renewcommand{\vec}[1]{\mbox{\bf {#1}}}
	37
	38	\newcommand{\ReallySmallSpacing}{\renewcommand{\baselinestretch}{.6}\Large\normalsize}
	39	\newcommand{\SmallSpacing}{\renewcommand{\baselinestretch}{1.1}\Large\normalsize}
	40
	41	\newcommand{\halfs}{\frac{1}{2}}
	42
	43	\setlength{\oddsidemargin}{-.25 truein}
	44	\setlength{\evensidemargin}{0truein}
	45	\setlength{\topmargin}{-0.2truein}
	46	\setlength{\textwidth}{7 truein}
	47	\setlength{\textheight}{8.5 truein}
	48	\setlength{\parindent}{0.20truein}
	49	\setlength{\parskip}{0.10truein}
	50
	51	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	52	\pagestyle{fancy}
	53	\lhead{}
	54	\chead{Getting Started with doParallel and foreach}
	55	\rhead{}
	56	\lfoot{}
	57	\cfoot{}
	58	\rfoot{\thepage}
	59	\renewcommand{\headrulewidth}{1pt}
	60	\renewcommand{\footrulewidth}{1pt}
	61	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	62
	63	\title{Getting Started with doParallel and foreach}
	64	\author{Steve Weston\footnote{Steve Weston wrote the original version of this vignette for the doMC package. Rich Calaway
	65	adapted the vignette for doParallel.} and Rich Calaway}
	66
	67
	68	\begin{document}
	69
	70	\maketitle
	71
	72	\thispagestyle{empty}
	73
	74	\section{Introduction}
	75
	76	The \texttt{doParallel} package is a ``parallel backend'' for the
	77	\texttt{foreach} package. It provides a mechanism needed to execute
	78	\texttt{foreach} loops in parallel. The \texttt{foreach} package must
	79	be used in conjunction with a package such as \texttt{doParallel} in order to
	80	execute code in parallel. The user must register a parallel backend to
	81	use, otherwise \texttt{foreach} will execute tasks sequentially, even
	82	when the \%dopar\% operator is used.\footnote{\texttt{foreach} will
	83	issue a warning that it is running sequentially if no parallel backend
	84	has been registered. It will only issue this warning once, however.}
	85
	86	The \texttt{doParallel} package acts as an interface between \texttt{foreach}
	87	and the \texttt{parallel} package of R 2.14.0 and later. The \texttt{parallel}
	88	package is essentially a merger of the \texttt{multicore} package, which was
	89	written by Simon Urbanek, and the \texttt{snow} package, which was written
	90	by Luke Tierney and others. The \texttt{multicore} functionality supports
	91	multiple workers only on those operating systems that
	92	support the \texttt{fork} system call; this excludes Windows. By default,
	93	\texttt{doParallel} uses \texttt{multicore} functionality on Unix-like
	94	systems and \texttt{snow} functionality on Windows. Note that
	95	the \texttt{multicore} functionality only runs tasks on a single
	96	computer, not a cluster of computers. However, you can use the
	97	\texttt{snow} functionality to execute on a cluster, using Unix-like
	98	operating systems, Windows, or even a combination.
	99	It is pointless to use \texttt{doParallel} and \texttt{parallel}
	100	on a machine with only one processor with a single core. To get a speed
	101	improvement, it must run on a machine with multiple processors, multiple
	102	cores, or both.
	103
	104	\section{A word of caution}
	105
	106	Because the \texttt{parallel} package in \texttt{multicore} mode
	107	starts its workers using
	108	\texttt{fork} without doing a subsequent \texttt{exec}, it has some
	109	limitations. Some operations cannot be performed properly by forked
	110	processes. For example, connection objects very likely won't work.
	111	In some cases, this could cause an object to become corrupted, and
	112	the R session to crash.
	113
	114	\section{Registering the \texttt{doParallel} parallel backend}
	115
	116	To register \texttt{doParallel} to be used with \texttt{foreach}, you must
	117	call the \texttt{registerDoParallel} function. If you call this with no
	118	arguments, on Windows you will get three workers and on Unix-like
	119	systems you will get a number of workers equal to approximately half the
	120	number of cores on your system. You can also specify a cluster
	121	(as created by the \texttt{makeCluster} function) or a number of cores.
	122	The \texttt{cores} argument specifies the number of worker
	123	processes that \texttt{doParallel} will use to execute tasks, which will
	124	by default be
	125	equal to one-half the total number of cores on the machine. You don't need to
	126	specify a value for it, however. By default, \texttt{doParallel} will use the
	127	value of the ``cores'' option, as specified with
	128	the standard ``options'' function. If that isn't set, then
	129	\texttt{doParallel} will try to detect the number of cores, and use one-half
	130	that many workers.
	131
	132	Remember: unless \texttt{registerDoParallel} is called, \texttt{foreach} will
	133	{\em not} run in parallel. Simply loading the \texttt{doParallel} package is
	134	not enough.
	135
	136	\section{An example \texttt{doParallel} session}
	137
	138	Before we go any further, let's load \texttt{doParallel}, register it, and use
	139	it with \texttt{foreach}. We will use \texttt{snow}-like functionality in this
	140	vignette, so we start by loading the package and starting a cluster:
	141
	142	<<loadLibs>>=
	143	library(doParallel)
	144	cl <- makeCluster(2)
	145	registerDoParallel(cl)
	146	foreach(i=1:3) %dopar% sqrt(i)
	147	@
	148	<<echo=FALSE>>=
	149	stopCluster(cl)
	150	@
	151
	152	To use \texttt{multicore}-like functionality, we would specify the number
	153	of cores to use instead (but note that on Windows, attempting to use more
	154	than one core with \texttt{parallel} results in an error):
	155	\begin{verbatim}
	156	library(doParallel)
	157	registerDoParallel(cores=2)
	158	foreach(i=1:3) %dopar% sqrt(i)
	159	\end{verbatim}
	160
	161	\begin{quote}
	162	Note well that this is {\em not} a practical use of \texttt{doParallel}. This
	163	is our ``Hello, world'' program for parallel computing. It tests that
	164	everything is installed and set up properly, but don't expect it to run
	165	faster than a sequential \texttt{for} loop, because it won't!
	166	\texttt{sqrt} executes far too quickly to be worth executing in
	167	parallel, even with a large number of iterations. With small tasks, the
	168	overhead of scheduling the task and returning the result can be greater
	169	than the time to execute the task itself, resulting in poor performance.
	170	In addition, this example doesn't make use of the vector capabilities of
	171	\texttt{sqrt}, which it must to get decent performance. This is just a
	172	test and a pedagogical example, {\em not} a benchmark.
	173	\end{quote}
	174
	175	But returning to the point of this example, you can see that it is very
	176	simple to load \texttt{doParallel} with all of its dependencies
	177	(\texttt{foreach}, \texttt{iterators}, \texttt{parallel}, etc.), and to
	178	register it. For the rest of the R session, whenever you execute
	179	\texttt{foreach} with \texttt{\%dopar\%}, the tasks will be executed
	180	using \texttt{doParallel} and \texttt{parallel}. Note that you can register
	181	a different parallel backend later, or deregister \texttt{doParallel} by
	182	registering the sequential backend by calling the \texttt{registerDoSEQ}
	183	function.
	184
	185	\section{A more serious example}
	186
	187	Now that we've gotten our feet wet, let's do something a bit less
	188	trivial. One good example is bootstrapping. Let's see how long it
	189	takes to run 10,000 bootstrap iterations in parallel on
	190	\Sexpr{getDoParWorkers()} cores:
	191
	192	<<echo=FALSE>>=
	193	library(doParallel)
	194	cl <- makeCluster(2)
	195	registerDoParallel(cl)
	196	@
	197	<<bootpar>>=
	198	x <- iris[which(iris[,5] != "setosa"), c(1,5)]
	199	trials <- 10000
	200
	201	ptime <- system.time({
	202	r <- foreach(icount(trials), .combine=cbind) %dopar% {
	203	ind <- sample(100, 100, replace=TRUE)
	204	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	205	coefficients(result1)
	206	}
	207	})[3]
	208	ptime
	209	@
	210
	211	Using \texttt{doParallel} and \texttt{parallel} we were able to perform
	212	10,000 bootstrap iterations in \Sexpr{ptime} seconds on
	213	\Sexpr{getDoParWorkers()} cores. By changing the \texttt{\%dopar\%} to
	214	\texttt{\%do\%}, we can run the same code sequentially to determine the
	215	performance improvement:
	216
	217	<<bootseq>>=
	218	stime <- system.time({
	219	r <- foreach(icount(trials), .combine=cbind) %do% {
	220	ind <- sample(100, 100, replace=TRUE)
	221	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	222	coefficients(result1)
	223	}
	224	})[3]
	225	stime
	226	@
	227
	228
	229	The sequential version ran in \Sexpr{stime} seconds, which means the
	230	speed up is about \Sexpr{round(stime / ptime, digits=1)} on
	231	\Sexpr{getDoParWorkers()} workers.\footnote{If you build this vignette
	232	yourself, you can see how well this problem runs on your hardware. None
	233	of the times are hardcoded in this document. You can also run the same
	234	example which is in the examples directory of the \texttt{doParallel}
	235	distribution.} Ideally, the speed up would be \Sexpr{getDoParWorkers()},
	236	but no multicore CPUs are ideal, and neither are the operating systems
	237	and software that run on them.
	238
	239	At any rate, this is a more realistic example that is worth executing in
	240	parallel. We do not explain what it's doing or how it works
	241	here. We just want to give you something more substantial than the
	242	\texttt{sqrt} example in case you want to run some benchmarks yourself.
	243	You can also run this example on a cluster by simply reregistering
	244	with a cluster object that specifies the nodes to use. (See the
	245	\texttt{makeCluster} help file for more details.)
	246
	247	\section{Getting information about the parallel backend}
	248
	249	To find out how many workers \texttt{foreach} is going to use, you can
	250	use the \texttt{getDoParWorkers} function:
	251
	252	<<getDoParWorkers>>=
	253	getDoParWorkers()
	254	@
	255
	256	This is a useful sanity check that you're actually running in parallel.
	257	If you haven't registered a parallel backend, or if your machine only
	258	has one core, \texttt{getDoParWorkers} will return one. In either case,
	259	don't expect a speed improvement. \texttt{foreach} is clever, but it
	260	isn't magic.
	261
	262	The \texttt{getDoParWorkers} function is also useful when you want the
	263	number of tasks to be equal to the number of workers. You may want to
	264	pass this value to an iterator constructor, for example.
	265
	266	You can also get the name and version of the currently registered
	267	backend:
	268
	269	<<getDoParName>>=
	270	getDoParName()
	271	getDoParVersion()
	272	@
	273	<<echo=FALSE>>=
	274	stopCluster(cl)
	275	@
	276	This is mostly useful for documentation purposes, or for checking that
	277	you have the most recent version of \texttt{doParallel}.
	278
	279	\section{Specifying multicore options}
	280
	281	When using \texttt{multicore}-like functionality, the \texttt{doParallel} package allows
	282	you to specify various options when
	283	running \texttt{foreach} that are supported by the underlying
	284	\texttt{mclapply} function: ``preschedule'', ``set.seed'', ``silent'',
	285	and ``cores''. You can learn about these options from the
	286	\texttt{mclapply} man page. They are set using the \texttt{foreach}
	287	\texttt{.options.multicore} argument. Here's an example of how to do
	288	that:
	289
	290	\begin{verbatim}
	291	mcoptions <- list(preschedule=FALSE, set.seed=FALSE)
	292	foreach(i=1:3, .options.multicore=mcoptions) %dopar% sqrt(i)
	293	\end{verbatim}
	294
	295	The ``cores'' option allows you to temporarily override the number of
	296	workers to use for a single \texttt{foreach} operation. This is more
	297	convenient than having to re-register \texttt{doParallel}. If no
	298	value of ``cores'' was specified when \texttt{doParallel} was registered, you
	299	can also change this value dynamically using the \texttt{options}
	300	function:
	301
	302	\begin{verbatim}
	303	options(cores=2)
	304	getDoParWorkers()
	305	options(cores=3)
	306	getDoParWorkers()
	307	\end{verbatim}
	308
	309	If you did specify the number of cores when registering \texttt{doParallel},
	310	the ``cores'' option is ignored:
	311
	312	\begin{verbatim}
	313	registerDoParallel(4)
	314	options(cores=2)
	315	getDoParWorkers()
	316	\end{verbatim}
	317
	318	As you can see, there are a number of options for controlling the number
	319	of workers to use with \texttt{parallel}, but the default behaviour
	320	usually does what you want.
	321
	322	\section{Stopping your cluster}
	323
	324	If you are using \texttt{snow}-like functionality, you will want to stop your
	325	cluster when you are done using it. The \texttt{doParallel} package's
	326	\texttt{.onUnload} function will do this automatically if the cluster was created
	327	automatically by \texttt{registerDoParallel}, but if you created the cluster manually
	328	you should stop it using the \texttt{stopCluster} function:
	329
	330	\begin{verbatim}
	331	stopCluster(cl)
	332	\end{verbatim}
	333
	334	\section{Conclusion}
	335
	336	The \texttt{doParallel} and \texttt{parallel} packages provide a nice,
	337	efficient parallel programming platform for multiprocessor/multicore
	338	computers running operating systems such as Linux and Mac OS X. They are
	339	very easy to install, and very easy to use. In short order, an average
	340	R programmer can start executing parallel programs, without any previous
	341	experience in parallel computing.
	342
	343	\end{document}
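
Note on the vignette above: the register, run, and stop workflow it describes can be sketched as follows. This is a minimal illustration and not part of the upstream sources. It assumes two local worker processes can be started; the names `workers` and `roots` are introduced only for this example.

```r
library(doParallel)

# Start a two-worker cluster and register it as the foreach backend.
cl <- makeCluster(2)
registerDoParallel(cl)

# Sanity check: how many workers will %dopar% use?
workers <- getDoParWorkers()

# Run the "Hello, world" loop; always stop the cluster afterwards,
# and fall back to the sequential backend so no stale cluster stays registered.
roots <- tryCatch(
  foreach(i = 1:3, .combine = c) %dopar% sqrt(i),
  finally = {
    stopCluster(cl)
    registerDoSEQ()
  }
)
```

Wrapping the loop in `tryCatch(..., finally = ...)` is a design choice of this sketch: it stops a manually created cluster even if a task fails, which the vignette recommends doing by hand with `stopCluster`.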

inst/doc/gettingstartedParallel.pdf

Binary diff not shown

+83

-83

inst/examples/bootParallel.R

0		suppressMessages(library(doParallel))
1		cl <- makePSOCKcluster(4)
2		registerDoParallel(cl)
3
4		cat(sprintf('doParallel %s\n', packageVersion('doParallel')))
5		junk <- matrix(0, 1000000, 8)
6		cat(sprintf('Size of extra junk data: %d bytes\n', object.size(junk)))
7
8		x <- iris[which(iris[,5] != "setosa"), c(1,5)]
9
10		trials <- 10000
11
12		ptime <- system.time({
13		r <- foreach(icount(trials), .combine=cbind,
14		.export='junk') %dopar% {
15		ind <- sample(100, 100, replace=TRUE)
16		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
17		coefficients(result1)
18		}
19		})[3]
20		cat(sprintf('parallel foreach: %6.1f sec\n', ptime))
21
22		ptime2 <- system.time({
23		snowopts <- list(preschedule=TRUE)
24		r <- foreach(icount(trials), .combine=cbind,
25		.export='junk', .options.snow=snowopts) %dopar% {
26		ind <- sample(100, 100, replace=TRUE)
27		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
28		coefficients(result1)
29		}
30		})[3]
31		cat(sprintf('parallel foreach with prescheduling: %6.1f sec\n', ptime2))
32
33
34		ptime3 <- system.time({
35		chunks <- getDoParWorkers()
36		r <- foreach(n=idiv(trials, chunks=chunks), .combine=cbind,
37		.export='junk') %dopar% {
38		y <- lapply(seq_len(n), function(i) {
39		ind <- sample(100, 100, replace=TRUE)
40		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
41		coefficients(result1)
42		})
43		do.call('cbind', y)
44		}
45		})[3]
46		cat(sprintf('chunked parallel foreach: %6.1f sec\n', ptime3))
47
48		ptime4 <- system.time({
49		mkworker <- function(x, junk) {
50		force(x)
51		force(junk)
52		function(i) {
53		ind <- sample(100, 100, replace=TRUE)
54		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
55		coefficients(result1)
56		}
57		}
58		y <- parLapply(cl, seq_len(trials), mkworker(x, junk))
59		r <- do.call('cbind', y)
60		})[3]
61		cat(sprintf('parLapply: %6.1f sec\n', ptime4))
62
63		stime <- system.time({
64		y <- lapply(seq_len(trials), function(i) {
65		ind <- sample(100, 100, replace=TRUE)
66		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
67		coefficients(result1)
68		})
69		r <- do.call('cbind', y)
70		})[3]
71		cat(sprintf('sequential lapply: %6.1f sec\n', stime))
72
73		stime2 <- system.time({
74		r <- foreach(icount(trials), .combine=cbind) %do% {
75		ind <- sample(100, 100, replace=TRUE)
76		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
77		coefficients(result1)
78		}
79		})[3]
80		cat(sprintf('sequential foreach: %6.1f sec\n', stime2))
81
82		stopCluster(cl)
	0	suppressMessages(library(doParallel))
	1	cl <- makePSOCKcluster(4)
	2	registerDoParallel(cl)
	3
	4	cat(sprintf('doParallel %s\n', packageVersion('doParallel')))
	5	junk <- matrix(0, 1000000, 8)
	6	cat(sprintf('Size of extra junk data: %d bytes\n', object.size(junk)))
	7
	8	x <- iris[which(iris[,5] != "setosa"), c(1,5)]
	9
	10	trials <- 10000
	11
	12	ptime <- system.time({
	13	r <- foreach(icount(trials), .combine=cbind,
	14	.export='junk') %dopar% {
	15	ind <- sample(100, 100, replace=TRUE)
	16	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	17	coefficients(result1)
	18	}
	19	})[3]
	20	cat(sprintf('parallel foreach: %6.1f sec\n', ptime))
	21
	22	ptime2 <- system.time({
	23	snowopts <- list(preschedule=TRUE)
	24	r <- foreach(icount(trials), .combine=cbind,
	25	.export='junk', .options.snow=snowopts) %dopar% {
	26	ind <- sample(100, 100, replace=TRUE)
	27	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	28	coefficients(result1)
	29	}
	30	})[3]
	31	cat(sprintf('parallel foreach with prescheduling: %6.1f sec\n', ptime2))
	32
	33
	34	ptime3 <- system.time({
	35	chunks <- getDoParWorkers()
	36	r <- foreach(n=idiv(trials, chunks=chunks), .combine=cbind,
	37	.export='junk') %dopar% {
	38	y <- lapply(seq_len(n), function(i) {
	39	ind <- sample(100, 100, replace=TRUE)
	40	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	41	coefficients(result1)
	42	})
	43	do.call('cbind', y)
	44	}
	45	})[3]
	46	cat(sprintf('chunked parallel foreach: %6.1f sec\n', ptime3))
	47
	48	ptime4 <- system.time({
	49	mkworker <- function(x, junk) {
	50	force(x)
	51	force(junk)
	52	function(i) {
	53	ind <- sample(100, 100, replace=TRUE)
	54	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	55	coefficients(result1)
	56	}
	57	}
	58	y <- parLapply(cl, seq_len(trials), mkworker(x, junk))
	59	r <- do.call('cbind', y)
	60	})[3]
	61	cat(sprintf('parLapply: %6.1f sec\n', ptime4))
	62
	63	stime <- system.time({
	64	y <- lapply(seq_len(trials), function(i) {
	65	ind <- sample(100, 100, replace=TRUE)
	66	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	67	coefficients(result1)
	68	})
	69	r <- do.call('cbind', y)
	70	})[3]
	71	cat(sprintf('sequential lapply: %6.1f sec\n', stime))
	72
	73	stime2 <- system.time({
	74	r <- foreach(icount(trials), .combine=cbind) %do% {
	75	ind <- sample(100, 100, replace=TRUE)
	76	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	77	coefficients(result1)
	78	}
	79	})[3]
	80	cat(sprintf('sequential foreach: %6.1f sec\n', stime2))
	81
	82	stopCluster(cl)
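
Note on the chunked variant in bootParallel.R: it relies on `idiv` from the iterators package to split `trials` into one piece per worker, so each task runs many bootstrap fits instead of one. The splitting itself can be sketched sequentially with `%do%`, so no cluster is needed. The values 10 and 3 and the names `sizes` and `total` are illustrative assumptions, not taken from the script.

```r
library(foreach)
library(iterators)

trials <- 10   # illustrative; the example script uses 10000
chunks <- 3    # illustrative; the script uses getDoParWorkers()

# Each iteration receives n, the number of trials assigned to that chunk.
sizes <- foreach(n = idiv(trials, chunks = chunks), .combine = c) %do% n

# The chunk sizes should add up to the original number of trials.
total <- sum(sizes)
```

Fewer, larger tasks reduce the per-task scheduling overhead that the vignette warns about for very small tasks.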

+83

-83

inst/unitTests/options.R

0		test.preschedule <- function() {
1		x <- list(1:3, 1:9, 1:19)
2		cs <- 1:20
3		dpn <- getDoParName()
4
5		for (chunkSize in cs) {
6		## preschedule is TRUE for MC by default and
7		## FALSE for SNOW, so we test by setting them otherwise
8		if (identical(dpn, "doParallelMC")) {
9		opts <- list(preschedule=FALSE)
10		} else {
11		opts <- list(preschedule=TRUE)
12		}
13		for (y in x) {
14		if (identical(dpn, "doParallelMC")) {
15		actual <- foreach(i=y, .options.multicore=opts) %dopar% i
16		}
17		else {
18		actual <- foreach(i=y, .options.snow=opts) %dopar% i
19		}
20		checkEquals(actual, as.list(y))
21		if (identical(dpn, "doParallelMC")) {
22		actual <- foreach(i=y, .combine="c", .options.multicore=opts) %dopar% i
23		}
24		else {
25		actual <- foreach(i=y, .combine="c", .options.snow=opts) %dopar% i
26		}
27		checkEquals(actual, y)
28		}
29		}
30		}
31
32		test.attach <- function() {
33		if (identical(getDoParName(), "doParallelMC")) {
34		return(TRUE)
35		} else {
36		myFun <- function(x){
37		myFun1(x+1)
38		}
39		myFun1 <- function(x){
40		2*x
41		}
42		testFun <- function(){
43		inRes1 <- checkTrue("exportEnv" %in% search())
44		if (!inRes1) {
45		stop("Attaching exportEnv failed")
46		}
47		inRes2 <- checkTrue(exists("myFun1", where=2))
48		if (!inRes2) {
49		stop("myFun1 not found in exportEnv")
50		}
51		myFun(1)
52		}
53		res <- suppressWarnings(foreach(i=1:4, .combine="c", .packages="RUnit",
54		.export="myFun1", .options.snow=list(attachExportEnv=TRUE)) %dopar% testFun())
55
56		checkEquals(res, c(4,4, 4, 4))
57		}
58		}
59
60		pkgname.test.stress <- function() {
61		if (!require(caret, quietly=TRUE)) {
62		return(TRUE)
63		} else {
64		library(mlbench)
65		data(BostonHousing)
66
67		lmFit <- train(medv ~ . + rm:lstat,
68		data = BostonHousing,
69		"lm")
70
71		library(rpart)
72		rpartFit <- train(medv ~ .,
73		data = BostonHousing,
74		"rpart",
75		tuneLength = 9)
76		}
77		}
78
79		"test.pkgname.test.stress" <- function()
80		{
81		res <- try(pkgname.test.stress())
82		checkTrue(!is(res, "try-error"), msg="pkgname stress test failed")
	0	test.preschedule <- function() {
	1	x <- list(1:3, 1:9, 1:19)
	2	cs <- 1:20
	3	dpn <- getDoParName()
	4
	5	for (chunkSize in cs) {
	6	## preschedule is TRUE for MC by default and
	7	## FALSE for SNOW, so we test by setting them otherwise
	8	if (identical(dpn, "doParallelMC")) {
	9	opts <- list(preschedule=FALSE)
	10	} else {
	11	opts <- list(preschedule=TRUE)
	12	}
	13	for (y in x) {
	14	if (identical(dpn, "doParallelMC")) {
	15	actual <- foreach(i=y, .options.multicore=opts) %dopar% i
	16	}
	17	else {
	18	actual <- foreach(i=y, .options.snow=opts) %dopar% i
	19	}
	20	checkEquals(actual, as.list(y))
	21	if (identical(dpn, "doParallelMC")) {
	22	actual <- foreach(i=y, .combine="c", .options.multicore=opts) %dopar% i
	23	}
	24	else {
	25	actual <- foreach(i=y, .combine="c", .options.snow=opts) %dopar% i
	26	}
	27	checkEquals(actual, y)
	28	}
	29	}
	30	}
	31
	32	test.attach <- function() {
	33	if (identical(getDoParName(), "doParallelMC")) {
	34	return(TRUE)
	35	} else {
	36	myFun <- function(x){
	37	myFun1(x+1)
	38	}
	39	myFun1 <- function(x){
	40	2*x
	41	}
	42	testFun <- function(){
	43	inRes1 <- checkTrue("exportEnv" %in% search())
	44	if (!inRes1) {
	45	stop("Attaching exportEnv failed")
	46	}
	47	inRes2 <- checkTrue(exists("myFun1", where=2))
	48	if (!inRes2) {
	49	stop("myFun1 not found in exportEnv")
	50	}
	51	myFun(1)
	52	}
	53	res <- suppressWarnings(foreach(i=1:4, .combine="c", .packages="RUnit",
	54	.export="myFun1", .options.snow=list(attachExportEnv=TRUE)) %dopar% testFun())
	55
	56	checkEquals(res, c(4,4, 4, 4))
	57	}
	58	}
	59
	60	pkgname.test.stress <- function() {
	61	if (!require(caret, quietly=TRUE)) {
	62	return(TRUE)
	63	} else {
	64	library(mlbench)
	65	data(BostonHousing)
	66
	67	lmFit <- train(medv ~ . + rm:lstat,
	68	data = BostonHousing,
	69	"lm")
	70
	71	library(rpart)
	72	rpartFit <- train(medv ~ .,
	73	data = BostonHousing,
	74	"rpart",
	75	tuneLength = 9)
	76	}
	77	}
	78
	79	"test.pkgname.test.stress" <- function()
	80	{
	81	res <- try(pkgname.test.stress())
	82	checkTrue(!is(res, "try-error"), msg="pkgname stress test failed")
83	83	}⏎
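
Note on `test.preschedule`: what it checks for the snow-style backend can be sketched in isolation. The sketch below assumes two local workers can be started; the names `y` and `res` are hypothetical, and it only compares results with and without the `preschedule` option passed through `.options.snow`.

```r
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)

y <- 1:9

res <- tryCatch({
  # Default scheduling for the snow-style backend.
  default <- foreach(i = y, .combine = c) %dopar% i
  # Same loop with tasks prescheduled into chunks, one chunk per worker.
  presched <- foreach(i = y, .combine = c,
                      .options.snow = list(preschedule = TRUE)) %dopar% i
  list(default = default, presched = presched)
}, finally = {
  stopCluster(cl)
  registerDoSEQ()
})
```

Prescheduling changes how tasks are handed to workers, not the combined result, which is why the unit test can compare against the input directly.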

+46

-46

inst/unitTests/runTestSuite.sh

0		#!/bin/sh
1
2		LOGFILE=test.log
3
4		R --vanilla --slave > ${LOGFILE} 2>&1 <<'EOF'
5		library(doParallel)
6		library(RUnit)
7
8		verbose <- as.logical(Sys.getenv('FOREACH_VERBOSE', 'FALSE'))
9
10		library(doParallel)
11		registerDoParallel()
12
13		options(warn=1)
14		options(showWarnCalls=TRUE)
15
16		cat('Starting test at', date(), '\n')
17		cat(sprintf('doParallel version: %s\n', getDoParVersion()))
18		cat(sprintf('Running with %d worker(s)\n', getDoParWorkers()))
19
20		tests <- c('options.R')
21
22		errcase <- list()
23		for (f in tests) {
24		cat('\nRunning test file:', f, '\n')
25		t <- runTestFile(f)
26		e <- getErrors(t)
27		if (e$nErr + e$nFail > 0) {
28		errcase <- c(errcase, t)
29		print(t)
30		}
31		}
32
33		if (length(errcase) == 0) {
34		cat('* Ran all tests successfully *\n')
35		} else {
36		cat('!!! Encountered', length(errcase), 'problems !!!\n')
37		for (t in errcase) {
38		print(t)
39		}
40		}
41
42		stopImplicitCluster()
43
44		cat('Finished test at', date(), '\n')
45		EOF
	0	#!/bin/sh
	1
	2	LOGFILE=test.log
	3
	4	R --vanilla --slave > ${LOGFILE} 2>&1 <<'EOF'
	5	library(doParallel)
	6	library(RUnit)
	7
	8	verbose <- as.logical(Sys.getenv('FOREACH_VERBOSE', 'FALSE'))
	9
	10	library(doParallel)
	11	registerDoParallel()
	12
	13	options(warn=1)
	14	options(showWarnCalls=TRUE)
	15
	16	cat('Starting test at', date(), '\n')
	17	cat(sprintf('doParallel version: %s\n', getDoParVersion()))
	18	cat(sprintf('Running with %d worker(s)\n', getDoParWorkers()))
	19
	20	tests <- c('options.R')
	21
	22	errcase <- list()
	23	for (f in tests) {
	24	cat('\nRunning test file:', f, '\n')
	25	t <- runTestFile(f)
	26	e <- getErrors(t)
	27	if (e$nErr + e$nFail > 0) {
	28	errcase <- c(errcase, t)
	29	print(t)
	30	}
	31	}
	32
	33	if (length(errcase) == 0) {
	34	cat('* Ran all tests successfully *\n')
	35	} else {
	36	cat('!!! Encountered', length(errcase), 'problems !!!\n')
	37	for (t in errcase) {
	38	print(t)
	39	}
	40	}
	41
	42	stopImplicitCluster()
	43
	44	cat('Finished test at', date(), '\n')
	45	EOF

+39

-39

man/doParallel-package.Rd

0		\name{doParallel-package}
1		\alias{doParallel-package}
2		\alias{doParallel}
3		\docType{package}
4		\title{
5		The doParallel Package
6		}
7		\description{
8		The doParallel package provides a parallel backend for the foreach/\%dopar\%
9		function using the \code{parallel} package of R 2.14.0 and later.
10		}
11		\details{
12		Further information is available in the following help topics:
13		\tabular{ll}{
14		\code{registerDoParallel} \tab register doParallel to be used by foreach/\%dopar\%\cr
15		}
16
17		To see a tutorial introduction to the doParallel package,
18		use \code{vignette("gettingstartedParallel")}. To see a tutorial
19		introduction to the foreach package, use \code{vignette("foreach")}.
20
21		To see a demo of doParallel computing the sinc function,
22		use \code{demo(sincParallel)}.
23
24		Some examples (in addition to those in the help pages) are included in
25		the ``examples'' directory of the doParallel package. To list the files in
26		the examples directory,
27		use \code{list.files(system.file("examples", package="doParallel"))}.
28		To run the bootstrap example, use
29		\code{source(system.file("examples", "bootParallel.R", package="doParallel"))}.
30		This is a simple benchmark, executing both sequentially and in parallel.
31		There are many more examples that come with the foreach package,
32		which will work with the doParallel package if it is registered as the
33		parallel backend.
34
35		For a complete list of functions with individual help pages,
36		use \code{library(help="doParallel")}.
37		}
38		\keyword{package}
	0	\name{doParallel-package}
	1	\alias{doParallel-package}
	2	\alias{doParallel}
	3	\docType{package}
	4	\title{
	5	The doParallel Package
	6	}
	7	\description{
	8	The doParallel package provides a parallel backend for the foreach/\%dopar\%
	9	function using the \code{parallel} package of R 2.14.0 and later.
	10	}
	11	\details{
	12	Further information is available in the following help topics:
	13	\tabular{ll}{
	14	\code{registerDoParallel} \tab register doParallel to be used by foreach/\%dopar\%\cr
	15	}
	16
	17	To see a tutorial introduction to the doParallel package,
	18	use \code{vignette("gettingstartedParallel")}. To see a tutorial
	19	introduction to the foreach package, use \code{vignette("foreach")}.
	20
	21	To see a demo of doParallel computing the sinc function,
	22	use \code{demo(sincParallel)}.
	23
	24	Some examples (in addition to those in the help pages) are included in
	25	the ``examples'' directory of the doParallel package. To list the files in
	26	the examples directory,
	27	use \code{list.files(system.file("examples", package="doParallel"))}.
	28	To run the bootstrap example, use
	29	\code{source(system.file("examples", "bootParallel.R", package="doParallel"))}.
	30	This is a simple benchmark, executing both sequentially and in parallel.
	31	There are many more examples that come with the foreach package,
	32	which will work with the doParallel package if it is registered as the
	33	parallel backend.
	34
	35	For a complete list of functions with individual help pages,
	36	use \code{library(help="doParallel")}.
	37	}
	38	\keyword{package}

+59

-59

man/registerDoParallel.Rd

0		\name{registerDoParallel}
1		\alias{registerDoParallel}
2		\alias{stopImplicitCluster}
3		\title{registerDoParallel}
4		\description{
5		The \code{registerDoParallel} function is used to register the
6		parallel backend with the \code{foreach} package.
7		}
8		\usage{
9		registerDoParallel(cl, cores=NULL, \dots)
10		stopImplicitCluster()
11		}
12		\arguments{
13		\item{cl}{A cluster object as returned by \code{makeCluster}, or the number
14		of nodes to be created in the cluster. If not specified, on Windows a
15		three worker cluster is created and used.}
16		\item{cores}{The number of cores to use for parallel execution. If not
17		specified, the number of cores is set to the value of
18		\code{options("cores")}, if specified, or to one-half the number of cores detected
19		by the \code{parallel} package.}
20		\item{\dots}{Package options. Currently, only the \code{nocompile} option
21		is supported. If \code{nocompile} is set to \code{TRUE}, compiler
22		support is disabled.}
23		}
24		\details{
25		The \code{parallel} package from R 2.14.0 and later provides functions for
26		parallel execution of R code on machines with multiple cores or processors
27		or multiple computers. It is essentially a blend of the \code{snow} and
28		\code{multicore} packages. By default, the \code{doParallel} package uses
29		\code{snow}-like functionality. The \code{snow}-like functionality
30		should work fine on Unix-like systems, but the \code{multicore}-like
31		functionality is limited to a single sequential worker on Windows systems.
32		On workstations with multiple cores running Unix-like operating systems,
33		the system \code{fork} call is used to spawn copies of the current process.
34
35		The \code{doParallel} backend supports both multicore and snow options passed
36		through the \code{foreach} function.
37		The supported multicore options are \code{preschedule}, \code{set.seed},
38		\code{silent}, and \code{cores}, which are analogous to the similarly named
39		arguments to \code{\link{mclapply}}, and are passed using the
40		\code{.options.multicore} argument to \code{foreach}. The supported snow options are
41		\code{preschedule}, which like its multicore analog can be used to chunk the
42		tasks so that each worker gets a prescheduled chunk of tasks, and
43		\code{attachExportEnv}, which can be used to attach the export environment
44		in certain cases where R's lexical scoping is unable to find a needed
45		export. The snow options are passed to \code{foreach} using the \code{.options.snow}
46		argument.
47
48		The function \code{stopImplicitCluster} can be used in vignettes and other places
49		where it is important to explicitly close the implicitly created cluster.
50		}
51		\examples{
52		cl <- makePSOCKcluster(2)
53		registerDoParallel(cl)
54		m <- matrix(rnorm(9), 3, 3)
55		foreach(i=1:nrow(m), .combine=rbind) %dopar% (m[i,] / mean(m[i,]))
56		stopCluster(cl)
57		}
58		\keyword{utilities}
	0	\name{registerDoParallel}
	1	\alias{registerDoParallel}
	2	\alias{stopImplicitCluster}
	3	\title{registerDoParallel}
	4	\description{
	5	The \code{registerDoParallel} function is used to register the
	6	parallel backend with the \code{foreach} package.
	7	}
	8	\usage{
	9	registerDoParallel(cl, cores=NULL, \dots)
	10	stopImplicitCluster()
	11	}
	12	\arguments{
	13	\item{cl}{A cluster object as returned by \code{makeCluster}, or the number
	14	of nodes to be created in the cluster. If not specified, on Windows a
	15	three worker cluster is created and used.}
	16	\item{cores}{The number of cores to use for parallel execution. If not
	17	specified, the number of cores is set to the value of
	18	\code{options("cores")}, if specified, or to one-half the number of cores detected
	19	by the \code{parallel} package.}
	20	\item{\dots}{Package options. Currently, only the \code{nocompile} option
	21	is supported. If \code{nocompile} is set to \code{TRUE}, compiler
	22	support is disabled.}
	23	}
	24	\details{
	25	The \code{parallel} package from R 2.14.0 and later provides functions for
	26	parallel execution of R code on machines with multiple cores or processors
	27	or multiple computers. It is essentially a blend of the \code{snow} and
	28	\code{multicore} packages. By default, the \code{doParallel} package uses
	29	\code{snow}-like functionality. The \code{snow}-like functionality
	30	should work fine on Unix-like systems, but the \code{multicore}-like
	31	functionality is limited to a single sequential worker on Windows systems.
	32	On workstations with multiple cores running Unix-like operating systems,
	33	the system \code{fork} call is used to spawn copies of the current process.
	34
	35	The \code{doParallel} backend supports both multicore and snow options passed
	36	through the \code{foreach} function.
	37	The supported multicore options are \code{preschedule}, \code{set.seed},
	38	\code{silent}, and \code{cores}, which are analogous to the similarly named
	39	arguments to \code{\link{mclapply}}, and are passed using the
	40	\code{.options.multicore} argument to \code{foreach}. The supported snow options are
	41	\code{preschedule}, which like its multicore analog can be used to chunk the
	42	tasks so that each worker gets a prescheduled chunk of tasks, and
	43	\code{attachExportEnv}, which can be used to attach the export environment
	44	in certain cases where R's lexical scoping is unable to find a needed
	45	export. The snow options are passed to \code{foreach} using the \code{.options.snow}
	46	argument.
	47
	48	The function \code{stopImplicitCluster} can be used in vignettes and other places
	49	where it is important to explicitly close the implicitly created cluster.
	50	}
	51	\examples{
	52	cl <- makePSOCKcluster(2)
	53	registerDoParallel(cl)
	54	m <- matrix(rnorm(9), 3, 3)
	55	foreach(i=1:nrow(m), .combine=rbind) %dopar% (m[i,] / mean(m[i,]))
	56	stopCluster(cl)
	57	}
	58	\keyword{utilities}
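
Note on `stopImplicitCluster`: the help page above says it closes a cluster that `registerDoParallel` created on its own. A minimal sketch of that pattern follows, assuming registration by core count rather than by an explicit cluster object. The matrix `m` and the names `workers` and `row_sums` are illustrative only.

```r
library(doParallel)

# Register by core count; no cluster object is created by the caller.
registerDoParallel(cores = 2)
workers <- getDoParWorkers()

m <- matrix(1:6, 2, 3)
row_sums <- foreach(i = seq_len(nrow(m)), .combine = c) %dopar% sum(m[i, ])

# Close any implicitly created cluster, then fall back to sequential execution.
stopImplicitCluster()
registerDoSEQ()
```

Calling `stopImplicitCluster()` unconditionally keeps the script portable: it is intended to be safe to call whether or not an implicit cluster was actually created on the current platform.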

+344

-344

vignettes/gettingstartedParallel.Rnw

0		% \VignetteIndexEntry{Getting Started with doParallel and foreach}
1		% \VignetteDepends{doParallel}
2		% \VignetteDepends{foreach}
3		% \VignettePackage{doParallel}
4		\documentclass[12pt]{article}
5		\usepackage{amsmath}
6		\usepackage[pdftex]{graphicx}
7		\usepackage{color}
8		\usepackage{xspace}
9		\usepackage{url}
10		\usepackage{fancyvrb}
11		\usepackage{fancyhdr}
12		\usepackage[
13		colorlinks=true,
14		linkcolor=blue,
15		citecolor=blue,
16		urlcolor=blue]
17		{hyperref}
18		\usepackage{lscape}
19
20		\usepackage{Sweave}
21
22		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
23
24		% define new colors for use
25		\definecolor{darkgreen}{rgb}{0,0.6,0}
26		\definecolor{darkred}{rgb}{0.6,0.0,0}
27		\definecolor{lightbrown}{rgb}{1,0.9,0.8}
28		\definecolor{brown}{rgb}{0.6,0.3,0.3}
29		\definecolor{darkblue}{rgb}{0,0,0.8}
30		\definecolor{darkmagenta}{rgb}{0.5,0,0.5}
31
32		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
33
34		\newcommand{\bld}[1]{\mbox{\boldmath $#1$}}
35		\newcommand{\shell}[1]{\mbox{$#1$}}
36		\renewcommand{\vec}[1]{\mbox{\bf {#1}}}
37
38		\newcommand{\ReallySmallSpacing}{\renewcommand{\baselinestretch}{.6}\Large\normalsize}
39		\newcommand{\SmallSpacing}{\renewcommand{\baselinestretch}{1.1}\Large\normalsize}
40
41		\newcommand{\halfs}{\frac{1}{2}}
42
43		\setlength{\oddsidemargin}{-.25 truein}
44		\setlength{\evensidemargin}{0truein}
45		\setlength{\topmargin}{-0.2truein}
46		\setlength{\textwidth}{7 truein}
47		\setlength{\textheight}{8.5 truein}
48		\setlength{\parindent}{0.20truein}
49		\setlength{\parskip}{0.10truein}
50
51		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
52		\pagestyle{fancy}
53		\lhead{}
54		\chead{Getting Started with doParallel and foreach}
55		\rhead{}
56		\lfoot{}
57		\cfoot{}
58		\rfoot{\thepage}
59		\renewcommand{\headrulewidth}{1pt}
60		\renewcommand{\footrulewidth}{1pt}
61		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
62
63		\title{Getting Started with doParallel and foreach}
64		\author{Steve Weston\footnote{Steve Weston wrote the original version of this vignette for the doMC package. Rich Calaway
65		adapted the vignette for doParallel.} and Rich Calaway \\ doc@revolutionanalytics.com}
66
67
68		\begin{document}
69
70		\maketitle
71
72		\thispagestyle{empty}
73
74		\section{Introduction}
75
76		The \texttt{doParallel} package is a ``parallel backend'' for the
77		\texttt{foreach} package. It provides a mechanism needed to execute
78		\texttt{foreach} loops in parallel. The \texttt{foreach} package must
79		be used in conjunction with a package such as \texttt{doParallel} in order to
80		execute code in parallel. The user must register a parallel backend to
81		use, otherwise \texttt{foreach} will execute tasks sequentially, even
82		when the \%dopar\% operator is used.\footnote{\texttt{foreach} will
83		issue a warning that it is running sequentially if no parallel backend
84		has been registered. It will only issue this warning once, however.}
85
86		The \texttt{doParallel} package acts as an interface between \texttt{foreach}
87		and the \texttt{parallel} package of R 2.14.0 and later. The \texttt{parallel}
88		package is essentially a merger of the \texttt{multicore} package, which was
89		written by Simon Urbanek, and the \texttt{snow} package, which was written
90		by Luke Tierney and others. The \texttt{multicore} functionality supports
91		multiple workers only on those operating systems that
92		support the \texttt{fork} system call; this excludes Windows. By default,
93		\texttt{doParallel} uses \texttt{multicore} functionality on Unix-like
94		systems and \texttt{snow} functionality on Windows. Note that
95		the \texttt{multicore} functionality only runs tasks on a single
96		computer, not a cluster of computers. However, you can use the
97		\texttt{snow} functionality to execute on a cluster, using Unix-like
98		operating systems, Windows, or even a combination.
99		It is pointless to use \texttt{doParallel} and \texttt{parallel}
100		on a machine with only one processor with a single core. To get a speed
101		improvement, it must run on a machine with multiple processors, multiple
102		cores, or both.
103
104		\section{A word of caution}
105
106		Because the \texttt{parallel} package in \texttt{multicore} mode
107		starts its workers using
108		\texttt{fork} without doing a subsequent \texttt{exec}, it has some
109		limitations. Some operations cannot be performed properly by forked
110		processes. For example, connection objects very likely won't work.
111		In some cases, this could cause an object to become corrupted, and
112		the R session to crash.
113
114		\section{Registering the \texttt{doParallel} parallel backend}
115
116		To register \texttt{doParallel} to be used with \texttt{foreach}, you must
117		call the \texttt{registerDoParallel} function. If you call this with no
118		arguments, on Windows you will get three workers and on Unix-like
119		systems you will get a number of workers equal to approximately half the
120		number of cores on your system. You can also specify a cluster
121		(as created by the \texttt{makeCluster} function) or a number of cores.
122		The \texttt{cores} argument specifies the number of worker
123		processes that \texttt{doParallel} will use to execute tasks, which will
124		by default be
125		equal to one-half the total number of cores on the machine. You don't need to
126		specify a value for it, however. By default, \texttt{doParallel} will use the
127		value of the ``cores'' option, as specified with
128		the standard ``options'' function. If that isn't set, then
129		\texttt{doParallel} will try to detect the number of cores, and use one-half
130		that many workers.
131
132		Remember: unless \texttt{registerDoParallel} is called, \texttt{foreach} will
133		{\em not} run in parallel. Simply loading the \texttt{doParallel} package is
134		not enough.
135
136		\section{An example \texttt{doParallel} session}
137
138		Before we go any further, let's load \texttt{doParallel}, register it, and use
139		it with \texttt{foreach}. We will use \texttt{snow}-like functionality in this
140		vignette, so we start by loading the package and starting a cluster:
141
142		<<loadLibs>>=
143		library(doParallel)
144		cl <- makeCluster(2)
145		registerDoParallel(cl)
146		foreach(i=1:3) %dopar% sqrt(i)
147		@
148		<<echo=FALSE>>=
149		stopCluster(cl)
150		@
151
152		To use \texttt{multicore}-like functionality, we would specify the number
153		of cores to use instead (but note that on Windows, attempting to use more
154		than one core with \texttt{parallel} results in an error):
155		\begin{verbatim}
156		library(doParallel}
157		registerDoParallel(cores=2)
158		foreach(i=1:3) %dopar% sqrt(i)
159		\end{verbatim}
160
161		\begin{quote}
162		Note well that this is {\em not} a practical use of \texttt{doParallel}. This
163		is our ``Hello, world'' program for parallel computing. It tests that
164		everything is installed and set up properly, but don't expect it to run
165		faster than a sequential \texttt{for} loop, because it won't!
166		\texttt{sqrt} executes far too quickly to be worth executing in
167		parallel, even with a large number of iterations. With small tasks, the
168		overhead of scheduling the task and returning the result can be greater
169		than the time to execute the task itself, resulting in poor performance.
170		In addition, this example doesn't make use of the vector capabilities of
171		\texttt{sqrt}, which it must to get decent performance. This is just a
172		test and a pedagogical example, {\em not} a benchmark.
173		\end{quote}
174
175		But returning to the point of this example, you can see that it is very
176		simple to load \texttt{doParallel} with all of its dependencies
177		(\texttt{foreach}, \texttt{iterators}, \texttt{parallel}, etc), and to
178		register it. For the rest of the R session, whenever you execute
179		\texttt{foreach} with \texttt{\%dopar\%}, the tasks will be executed
180		using \texttt{doParallel} and \texttt{parallel}. Note that you can register
181		a different parallel backend later, or deregister \texttt{doParallel} by
182		registering the sequential backend by calling the \texttt{registerDoSEQ}
183		function.
184
185		\section{A more serious example}
186
187		Now that we've gotten our feet wet, let's do something a bit less
188		trivial. One good example is bootstrapping. Let's see how long it
189		takes to run 10,000 bootstrap iterations in parallel on
190		\Sexpr{getDoParWorkers()} cores:
191
192		<<echo=FALSE>>=
193		library(doParallel)
194		cl <- makeCluster(2)
195		registerDoParallel(cl)
196		@
197		<<bootpar>>=
198		x <- iris[which(iris[,5] != "setosa"), c(1,5)]
199		trials <- 10000
200
201		ptime <- system.time({
202		r <- foreach(icount(trials), .combine=cbind) %dopar% {
203		ind <- sample(100, 100, replace=TRUE)
204		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
205		coefficients(result1)
206		}
207		})[3]
208		ptime
209		@
210
211		Using \texttt{doParallel} and \texttt{parallel} we were able to perform
212		10,000 bootstrap iterations in \Sexpr{ptime} seconds on
213		\Sexpr{getDoParWorkers()} cores. By changing the \texttt{\%dopar\%} to
214		\texttt{\%do\%}, we can run the same code sequentially to determine the
215		performance improvement:
216
217		<<bootseq>>=
218		stime <- system.time({
219		r <- foreach(icount(trials), .combine=cbind) %do% {
220		ind <- sample(100, 100, replace=TRUE)
221		result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
222		coefficients(result1)
223		}
224		})[3]
225		stime
226		@
227
228
229		The sequential version ran in \Sexpr{stime} seconds, which means the
230		speed up is about \Sexpr{round(stime / ptime, digits=1)} on
231		\Sexpr{getDoParWorkers()} workers.\footnote{If you build this vignette
232		yourself, you can see how well this problem runs on your hardware. None
233		of the times are hardcoded in this document. You can also run the same
234		example which is in the examples directory of the \texttt{doParallel}
235		distribution.} Ideally, the speed up would be \Sexpr{getDoParWorkers()},
236		but no multicore CPUs are ideal, and neither are the operating systems
237		and software that run on them.
238
239		At any rate, this is a more realistic example that is worth executing in
240		parallel. We do not explain what it's doing or how it works
241		here. We just want to give you something more substantial than the
242		\texttt{sqrt} example in case you want to run some benchmarks yourself.
243		You can also run this example on a cluster by simply reregistering
244		with a cluster object that specifies the nodes to use. (See the
245		\texttt{makeCluster} help file for more details.)
246
247		\section{Getting information about the parallel backend}
248
249		To find out how many workers \texttt{foreach} is going to use, you can
250		use the \texttt{getDoParWorkers} function:
251
252		<<getDoParWorkers>>=
253		getDoParWorkers()
254		@
255
256		This is a useful sanity check that you're actually running in parallel.
257		If you haven't registered a parallel backend, or if your machine only
258		has one core, \texttt{getDoParWorkers} will return one. In either case,
259		don't expect a speed improvement. \texttt{foreach} is clever, but it
260		isn't magic.
261
262		The \texttt{getDoParWorkers} function is also useful when you want the
263		number of tasks to be equal to the number of workers. You may want to
264		pass this value to an iterator constructor, for example.
265
266		You can also get the name and version of the currently registered
267		backend:
268
269		<<getDoParName>>=
270		getDoParName()
271		getDoParVersion()
272		@
273		<<echo=FALSE>>=
274		stopCluster(cl)
275		@
276		This is mostly useful for documentation purposes, or for checking that
277		you have the most recent version of \texttt{doParallel}.
278
279		\section{Specifying multicore options}
280
281		When using \texttt{multicore}-like functionality, the \texttt{doParallel} package allows
282		you to specify various options when
283		running \texttt{foreach} that are supported by the underlying
284		\texttt{mclapply} function: ``preschedule'', ``set.seed'', ``silent'',
285		and ``cores''. You can learn about these options from the
286		\texttt{mclapply} man page. They are set using the \texttt{foreach}
287		\texttt{.options.multicore} argument. Here's an example of how to do
288		that:
289
290		\begin{verbatim}
291		mcoptions <- list(preschedule=FALSE, set.seed=FALSE)
292		foreach(i=1:3, .options.multicore=mcoptions) %dopar% sqrt(i)
293		\end{verbatim}
294
295		The ``cores'' option allows you to temporarily override the number of
296		workers to use for a single \texttt{foreach} operation. This is more
297		convenient than having to re-register \texttt{doParallel}. If no
298		value of ``cores'' was specified when \texttt{doParallel} was registered, you
299		can also change this value dynamically using the \texttt{options}
300		function:
301
302		\begin{verbatim}
303		options(cores=2)
304		getDoParWorkers()
305		options(cores=3)
306		getDoParWorkers()
307		\end{verbatim}
308
309		If you did specify the number of cores when registering \texttt{doParallel},
310		the ``cores'' option is ignored:
311
312		\begin{verbatim}
313		registerDoParallel(4)
314		options(cores=2)
315		getDoParWorkers()
316		\end{verbatim}
317
318		As you can see, there are a number of options for controlling the number
319		of workers to use with \texttt{parallel}, but the default behaviour
320		usually does what you want.
321
322		\section{Stopping your cluster}
323
324		If you are using \texttt{snow}-like functionality, you will want to stop your
325		cluster when you are done using it. The \texttt{doParallel} package's
326		\texttt{.onUnload} function will do this automatically if the cluster was created
327		automatically by \texttt{registerDoParallel}, but if you created the cluster manually
328		you should stop it using the \texttt{stopCluster} function:
329
330		\begin{verbatim}
331		stopCluster(cl)
332		\end{verbatim}
333
334		\section{Conclusion}
335
336		The \texttt{doParallel} and \texttt{parallel} packages provide a nice,
337		efficient parallel programming platform for multiprocessor/multicore
338		computers running operating systems such as Linux and Mac OS X. It is
339		very easy to install, and very easy to use. In short order, an average
340		R programmer can start executing parallel programs, without any previous
341		experience in parallel computing.
342
343		\end{document}
	0	% \VignetteIndexEntry{Getting Started with doParallel and foreach}
	1	% \VignetteDepends{doParallel}
	2	% \VignetteDepends{foreach}
	3	% \VignettePackage{doParallel}
	4	\documentclass[12pt]{article}
	5	\usepackage{amsmath}
	6	\usepackage[pdftex]{graphicx}
	7	\usepackage{color}
	8	\usepackage{xspace}
	9	\usepackage{url}
	10	\usepackage{fancyvrb}
	11	\usepackage{fancyhdr}
	12	\usepackage[
	13	colorlinks=true,
	14	linkcolor=blue,
	15	citecolor=blue,
	16	urlcolor=blue]
	17	{hyperref}
	18	\usepackage{lscape}
	19
	20	\usepackage{Sweave}
	21
	22	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	23
	24	% define new colors for use
	25	\definecolor{darkgreen}{rgb}{0,0.6,0}
	26	\definecolor{darkred}{rgb}{0.6,0.0,0}
	27	\definecolor{lightbrown}{rgb}{1,0.9,0.8}
	28	\definecolor{brown}{rgb}{0.6,0.3,0.3}
	29	\definecolor{darkblue}{rgb}{0,0,0.8}
	30	\definecolor{darkmagenta}{rgb}{0.5,0,0.5}
	31
	32	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	33
	34	\newcommand{\bld}[1]{\mbox{\boldmath $#1$}}
	35	\newcommand{\shell}[1]{\mbox{$#1$}}
	36	\renewcommand{\vec}[1]{\mbox{\bf {#1}}}
	37
	38	\newcommand{\ReallySmallSpacing}{\renewcommand{\baselinestretch}{.6}\Large\normalsize}
	39	\newcommand{\SmallSpacing}{\renewcommand{\baselinestretch}{1.1}\Large\normalsize}
	40
	41	\newcommand{\halfs}{\frac{1}{2}}
	42
	43	\setlength{\oddsidemargin}{-.25 truein}
	44	\setlength{\evensidemargin}{0truein}
	45	\setlength{\topmargin}{-0.2truein}
	46	\setlength{\textwidth}{7 truein}
	47	\setlength{\textheight}{8.5 truein}
	48	\setlength{\parindent}{0.20truein}
	49	\setlength{\parskip}{0.10truein}
	50
	51	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	52	\pagestyle{fancy}
	53	\lhead{}
	54	\chead{Getting Started with doParallel and foreach}
	55	\rhead{}
	56	\lfoot{}
	57	\cfoot{}
	58	\rfoot{\thepage}
	59	\renewcommand{\headrulewidth}{1pt}
	60	\renewcommand{\footrulewidth}{1pt}
	61	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	62
	63	\title{Getting Started with doParallel and foreach}
	64	\author{Steve Weston\footnote{Steve Weston wrote the original version of this vignette for the doMC package. Rich Calaway
	65	adapted the vignette for doParallel.} and Rich Calaway}
	66
	67
	68	\begin{document}
	69
	70	\maketitle
	71
	72	\thispagestyle{empty}
	73
	74	\section{Introduction}
	75
	76	The \texttt{doParallel} package is a ``parallel backend'' for the
	77	\texttt{foreach} package. It provides a mechanism needed to execute
	78	\texttt{foreach} loops in parallel. The \texttt{foreach} package must
	79	be used in conjunction with a package such as \texttt{doParallel} in order to
	80	execute code in parallel. The user must register a parallel backend to
	81	use, otherwise \texttt{foreach} will execute tasks sequentially, even
	82	when the \%dopar\% operator is used.\footnote{\texttt{foreach} will
	83	issue a warning that it is running sequentially if no parallel backend
	84	has been registered. It will only issue this warning once, however.}
	85
	86	The \texttt{doParallel} package acts as an interface between \texttt{foreach}
	87	and the \texttt{parallel} package of R 2.14.0 and later. The \texttt{parallel}
	88	package is essentially a merger of the \texttt{multicore} package, which was
	89	written by Simon Urbanek, and the \texttt{snow} package, which was written
	90	by Luke Tierney and others. The \texttt{multicore} functionality supports
	91	multiple workers only on those operating systems that
	92	support the \texttt{fork} system call; this excludes Windows. By default,
	93	\texttt{doParallel} uses \texttt{multicore} functionality on Unix-like
	94	systems and \texttt{snow} functionality on Windows. Note that
	95	the \texttt{multicore} functionality only runs tasks on a single
	96	computer, not a cluster of computers. However, you can use the
	97	\texttt{snow} functionality to execute on a cluster, using Unix-like
	98	operating systems, Windows, or even a combination.
	99	It is pointless to use \texttt{doParallel} and \texttt{parallel}
	100	on a machine with only one processor with a single core. To get a speed
	101	improvement, it must run on a machine with multiple processors, multiple
	102	cores, or both.
	103
	104	\section{A word of caution}
	105
	106	Because the \texttt{parallel} package in \texttt{multicore} mode
	107	starts its workers using
	108	\texttt{fork} without doing a subsequent \texttt{exec}, it has some
	109	limitations. Some operations cannot be performed properly by forked
	110	processes. For example, connection objects very likely won't work.
	111	In some cases, this could cause an object to become corrupted, and
	112	the R session to crash.
	113
	114	\section{Registering the \texttt{doParallel} parallel backend}
	115
	116	To register \texttt{doParallel} to be used with \texttt{foreach}, you must
	117	call the \texttt{registerDoParallel} function. If you call this with no
	118	arguments, on Windows you will get three workers and on Unix-like
	119	systems you will get a number of workers equal to approximately half the
	120	number of cores on your system. You can also specify a cluster
	121	(as created by the \texttt{makeCluster} function) or a number of cores.
	122	The \texttt{cores} argument specifies the number of worker
	123	processes that \texttt{doParallel} will use to execute tasks, which will
	124	by default be
	125	equal to one-half the total number of cores on the machine. You don't need to
	126	specify a value for it, however. By default, \texttt{doParallel} will use the
	127	value of the ``cores'' option, as specified with
	128	the standard ``options'' function. If that isn't set, then
	129	\texttt{doParallel} will try to detect the number of cores, and use one-half
	130	that many workers.
	131
	132	Remember: unless \texttt{registerDoParallel} is called, \texttt{foreach} will
	133	{\em not} run in parallel. Simply loading the \texttt{doParallel} package is
	134	not enough.
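
As a minimal sketch of the registration step described in this section, the following registers \texttt{doParallel} with an explicit number of cores and checks the result. The value 2 is an arbitrary choice for illustration, and the variable names are assumptions, not part of the vignette.

```r
library(doParallel)

# Register with an explicit worker count. On Unix-like systems this selects
# the multicore-like functionality; on Windows an implicit cluster is used.
registerDoParallel(cores = 2)
n_workers <- getDoParWorkers()  # number of workers foreach will use

res <- foreach(i = 1:4, .combine = c) %dopar% i^2

# Harmless when no implicit cluster exists; closes it when one was created.
stopImplicitCluster()
```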
	135
	136	\section{An example \texttt{doParallel} session}
	137
	138	Before we go any further, let's load \texttt{doParallel}, register it, and use
	139	it with \texttt{foreach}. We will use \texttt{snow}-like functionality in this
	140	vignette, so we start by loading the package and starting a cluster:
	141
	142	<<loadLibs>>=
	143	library(doParallel)
	144	cl <- makeCluster(2)
	145	registerDoParallel(cl)
	146	foreach(i=1:3) %dopar% sqrt(i)
	147	@
	148	<<echo=FALSE>>=
	149	stopCluster(cl)
	150	@
	151
	152	To use \texttt{multicore}-like functionality, we would specify the number
	153	of cores to use instead (but note that on Windows, attempting to use more
	154	than one core with \texttt{parallel} results in an error):
	155	\begin{verbatim}
	156	library(doParallel)
	157	registerDoParallel(cores=2)
	158	foreach(i=1:3) %dopar% sqrt(i)
	159	\end{verbatim}
	160
	161	\begin{quote}
	162	Note well that this is {\em not} a practical use of \texttt{doParallel}. This
	163	is our ``Hello, world'' program for parallel computing. It tests that
	164	everything is installed and set up properly, but don't expect it to run
	165	faster than a sequential \texttt{for} loop, because it won't!
	166	\texttt{sqrt} executes far too quickly to be worth executing in
	167	parallel, even with a large number of iterations. With small tasks, the
	168	overhead of scheduling the task and returning the result can be greater
	169	than the time to execute the task itself, resulting in poor performance.
	170	In addition, this example doesn't make use of the vector capabilities of
	171	\texttt{sqrt}, which it must to get decent performance. This is just a
	172	test and a pedagogical example, {\em not} a benchmark.
	173	\end{quote}
	174
	175	But returning to the point of this example, you can see that it is very
	176	simple to load \texttt{doParallel} with all of its dependencies
	177	(\texttt{foreach}, \texttt{iterators}, \texttt{parallel}, etc), and to
	178	register it. For the rest of the R session, whenever you execute
	179	\texttt{foreach} with \texttt{\%dopar\%}, the tasks will be executed
	180	using \texttt{doParallel} and \texttt{parallel}. Note that you can register
	181	a different parallel backend later, or deregister \texttt{doParallel} by
	182	registering the sequential backend by calling the \texttt{registerDoSEQ}
	183	function.
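
A short sketch of switching backends, assuming a two-worker cluster chosen only for illustration: after \texttt{registerDoSEQ} is called, \texttt{\%dopar\%} loops run sequentially again.

```r
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)
par_name <- getDoParName()        # name reported by the doParallel backend
stopCluster(cl)

# Deregister doParallel by registering the sequential backend.
registerDoSEQ()
seq_name <- getDoParName()        # name reported by the sequential backend
seq_workers <- getDoParWorkers()  # the sequential backend has one worker

res <- foreach(i = 1:3, .combine = c) %dopar% sqrt(i)  # now runs sequentially
```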
	184
	185	\section{A more serious example}
	186
	187	Now that we've gotten our feet wet, let's do something a bit less
	188	trivial. One good example is bootstrapping. Let's see how long it
	189	takes to run 10,000 bootstrap iterations in parallel on
	190	\Sexpr{getDoParWorkers()} cores:
	191
	192	<<echo=FALSE>>=
	193	library(doParallel)
	194	cl <- makeCluster(2)
	195	registerDoParallel(cl)
	196	@
	197	<<bootpar>>=
	198	x <- iris[which(iris[,5] != "setosa"), c(1,5)]
	199	trials <- 10000
	200
	201	ptime <- system.time({
	202	r <- foreach(icount(trials), .combine=cbind) %dopar% {
	203	ind <- sample(100, 100, replace=TRUE)
	204	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	205	coefficients(result1)
	206	}
	207	})[3]
	208	ptime
	209	@
	210
	211	Using \texttt{doParallel} and \texttt{parallel} we were able to perform
	212	10,000 bootstrap iterations in \Sexpr{ptime} seconds on
	213	\Sexpr{getDoParWorkers()} cores. By changing the \texttt{\%dopar\%} to
	214	\texttt{\%do\%}, we can run the same code sequentially to determine the
	215	performance improvement:
	216
	217	<<bootseq>>=
	218	stime <- system.time({
	219	r <- foreach(icount(trials), .combine=cbind) %do% {
	220	ind <- sample(100, 100, replace=TRUE)
	221	result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
	222	coefficients(result1)
	223	}
	224	})[3]
	225	stime
	226	@
	227
	228
	229	The sequential version ran in \Sexpr{stime} seconds, which means the
	230	speed up is about \Sexpr{round(stime / ptime, digits=1)} on
	231	\Sexpr{getDoParWorkers()} workers.\footnote{If you build this vignette
	232	yourself, you can see how well this problem runs on your hardware. None
	233	of the times are hardcoded in this document. You can also run the same
	234	example which is in the examples directory of the \texttt{doParallel}
	235	distribution.} Ideally, the speed up would be \Sexpr{getDoParWorkers()},
	236	but no multicore CPUs are ideal, and neither are the operating systems
	237	and software that run on them.
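
The comparison above can be made explicit with a small calculation. The timings below are hypothetical values chosen only for illustration; when building the vignette, \texttt{stime} and \texttt{ptime} come from the chunks above instead.

```r
# Hypothetical elapsed times in seconds (not measured results).
stime <- 20     # sequential run
ptime <- 12     # parallel run
workers <- 2    # number of workers used for the parallel run

speedup <- stime / ptime          # how many times faster the parallel run is
efficiency <- speedup / workers   # fraction of the ideal speed up achieved

round(c(speedup = speedup, efficiency = efficiency), digits = 2)
```

An efficiency below 1 reflects the scheduling and communication overhead discussed earlier.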
	238
	239	At any rate, this is a more realistic example that is worth executing in
	240	parallel. We do not explain what it's doing or how it works
	241	here. We just want to give you something more substantial than the
	242	\texttt{sqrt} example in case you want to run some benchmarks yourself.
	243	You can also run this example on a cluster by simply reregistering
	244	with a cluster object that specifies the nodes to use. (See the
	245	\texttt{makeCluster} help file for more details.)
	246
	247	\section{Getting information about the parallel backend}
	248
	249	To find out how many workers \texttt{foreach} is going to use, you can
	250	use the \texttt{getDoParWorkers} function:
	251
	252	<<getDoParWorkers>>=
	253	getDoParWorkers()
	254	@
	255
	256	This is a useful sanity check that you're actually running in parallel.
	257	If you haven't registered a parallel backend, or if your machine only
	258	has one core, \texttt{getDoParWorkers} will return one. In either case,
	259	don't expect a speed improvement. \texttt{foreach} is clever, but it
	260	isn't magic.
	261
	262	The \texttt{getDoParWorkers} function is also useful when you want the
	263	number of tasks to be equal to the number of workers. You may want to
	264	pass this value to an iterator constructor, for example.
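
One way to do this, sketched under the assumption of a two-worker cluster and a total of 1000 units of work (both arbitrary), is to pass the worker count to the \texttt{idiv} iterator constructor from the \texttt{iterators} package, so that each worker receives exactly one task:

```r
library(doParallel)  # attaches foreach and iterators as well

cl <- makeCluster(2)
registerDoParallel(cl)

n <- 1000
# idiv() yields chunk sizes that add up to n, one chunk per worker.
sizes <- foreach(m = idiv(n, chunks = getDoParWorkers()), .combine = c) %dopar% {
  m  # a real task would process m items here
}

stopCluster(cl)
```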
	265
	266	You can also get the name and version of the currently registered
	267	backend:
	268
	269	<<getDoParName>>=
	270	getDoParName()
	271	getDoParVersion()
	272	@
	273	<<echo=FALSE>>=
	274	stopCluster(cl)
	275	@
	276	This is mostly useful for documentation purposes, or for checking that
	277	you have the most recent version of \texttt{doParallel}.
	278
	279	\section{Specifying multicore options}
	280
	281	When using \texttt{multicore}-like functionality, the \texttt{doParallel} package allows
	282	you to specify various options when
	283	running \texttt{foreach} that are supported by the underlying
	284	\texttt{mclapply} function: ``preschedule'', ``set.seed'', ``silent'',
	285	and ``cores''. You can learn about these options from the
	286	\texttt{mclapply} man page. They are set using the \texttt{foreach}
	287	\texttt{.options.multicore} argument. Here's an example of how to do
	288	that:
	289
	290	\begin{verbatim}
	291	mcoptions <- list(preschedule=FALSE, set.seed=FALSE)
	292	foreach(i=1:3, .options.multicore=mcoptions) %dopar% sqrt(i)
	293	\end{verbatim}
	294
	295	The ``cores'' option allows you to temporarily override the number of
	296	workers to use for a single \texttt{foreach} operation. This is more
	297	convenient than having to re-register \texttt{doParallel}. If no
	298	value of ``cores'' was specified when \texttt{doParallel} was registered, you
	299	can also change this value dynamically using the \texttt{options}
	300	function:
	301
	302	\begin{verbatim}
	303	options(cores=2)
	304	getDoParWorkers()
	305	options(cores=3)
	306	getDoParWorkers()
	307	\end{verbatim}
	308
	309	If you did specify the number of cores when registering \texttt{doParallel},
	310	the ``cores'' option is ignored:
	311
	312	\begin{verbatim}
	313	registerDoParallel(4)
	314	options(cores=2)
	315	getDoParWorkers()
	316	\end{verbatim}
	317
	318	As you can see, there are a number of options for controlling the number
	319	of workers to use with \texttt{parallel}, but the default behaviour
	320	usually does what you want.
	321
	322	\section{Stopping your cluster}
	323
	324	If you are using \texttt{snow}-like functionality, you will want to stop your
	325	cluster when you are done using it. The \texttt{doParallel} package's
	326	\texttt{.onUnload} function will do this automatically if the cluster was created
	327	automatically by \texttt{registerDoParallel}, but if you created the cluster manually
	328	you should stop it using the \texttt{stopCluster} function:
	329
	330	\begin{verbatim}
	331	stopCluster(cl)
	332	\end{verbatim}
	333
	334	\section{Conclusion}
	335
	336	The \texttt{doParallel} and \texttt{parallel} packages provide a nice,
	337	efficient parallel programming platform for multiprocessor/multicore
	338	computers running operating systems such as Linux and Mac OS X. It is
	339	very easy to install, and very easy to use. In short order, an average
	340	R programmer can start executing parallel programs, without any previous
	341	experience in parallel computing.
	342
	343	\end{document}