error in install.packages() (PR#14042)

Michael Spiegel-2

Full_Name: Michael Spiegel
Version: 2.10
OS: Windows Vista
Submission from: (NULL) (76.104.24.156)


The following error is produced when attempting to call install.packages.  Here
are the results of the traceback:

> source('http://openmx.psyc.virginia.edu/getOpenMx.R')
Error in f(res) : invalid subscript type 'list'
> traceback()
7: f(res)
6: available.packages(contriburl = contriburl, method = method)
5: .install.winbinary(pkgs = pkgs, lib = lib, contriburl = contriburl,
       method = method, available = available, destdir = destdir,
       dependencies = dependencies, ...)
4: install.packages(pkgs = c("OpenMx"), repos = repos)
3: eval.with.vis(expr, envir, enclos)
2: eval.with.vis(ei, envir)
1: source("http://openmx.psyc.virginia.edu/getOpenMx.R")

I've tracked the error down to somewhere in available.packages defined in
src\library\utils\R\packages.R.  I am guessing that the error in version 2.10
has something to do with the change: "available.packages() gains a 'filters'
argument for specifying the filtering operations performed on the packages found
in the repositories."

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Duncan Murdoch

Re: error in install.packages() (PR#14042)

On 11/4/2009 11:05 AM, [hidden email] wrote:

> Full_Name: Michael Spiegel
> Version: 2.10
> OS: Windows Vista
> Submission from: (NULL) (76.104.24.156)
>
>
> The following error is produced when attempting to call install.packages.  Here
> are the results of the traceback:
>
>> source('http://openmx.psyc.virginia.edu/getOpenMx.R')
> Error in f(res) : invalid subscript type 'list'
>> traceback()
> 7: f(res)
> 6: available.packages(contriburl = contriburl, method = method)
> 5: .install.winbinary(pkgs = pkgs, lib = lib, contriburl = contriburl,
>        method = method, available = available, destdir = destdir,
>        dependencies = dependencies, ...)
> 4: install.packages(pkgs = c("OpenMx"), repos = repos)
> 3: eval.with.vis(expr, envir, enclos)
> 2: eval.with.vis(ei, envir)
> 1: source("http://openmx.psyc.virginia.edu/getOpenMx.R")
>
> I've tracked the error down to somewhere in available.packages defined in
> src\library\utils\R\packages.R.  I am guessing that the error in version 2.10
> has something to do with the change: "available.packages() gains a 'filters'
> argument for specifying the filtering operations performed on the packages found
> in the repositories."
>

I can reproduce this, and yes, it is happening in one of the filters.
I'd guess it happens because your repository has only one entry (a
missed "drop=FALSE" somewhere maybe) or because the filter is finding no
matches.  I'll track down the details and fix it.
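For readers unfamiliar with the pitfall guessed at here: subsetting a matrix down to a single row drops its dimensions unless drop = FALSE is given. The sketch below uses a made-up two-package matrix; the names pkgs, one_row and kept are hypothetical and are not taken from available.packages(). The later messages in this thread show that this guess was not the actual cause.

```r
# A toy stand-in for a package matrix; contents are invented for illustration.
pkgs <- matrix(c("OpenMx", "0.2", "other", "1.0"), nrow = 2, byrow = TRUE,
               dimnames = list(NULL, c("Package", "Version")))

one_row <- pkgs[1, ]                 # dimensions dropped: a named character vector
kept    <- pkgs[1, , drop = FALSE]   # dimensions kept: still a 1 x 2 matrix

is.matrix(one_row)
dim(kept)
```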

Thanks for the report.

Duncan Murdoch

Duncan Murdoch

Re: error in install.packages() (PR#14042)

In reply to this post by Michael Spiegel-2

I've found the error, and will fix and commit to R-devel and R-patched.

For future reference:  the problem was that it assigned the result of
sapply() to a subset of a vector.  Normally sapply() simplifies its
result to a vector, but in this case the result was empty, so sapply()
returned an empty list; assigning a list to a vector coerced the vector
to a list, and then the "invalid subscript type 'list'" came soon after.
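The mechanism described above can be reproduced in a few lines. This is a sketch with made-up variable names (keep, idx, res), not the code from available.packages():

```r
keep <- c(TRUE, FALSE, TRUE)            # a logical vector that is to be updated
idx  <- integer(0)                      # the subset to update happens to be empty

res <- sapply(idx, function(i) i > 0)   # zero-length input: nothing to simplify
is.list(res)                            # TRUE: an empty list, not logical(0)

keep[idx] <- res                        # assigning a list coerces keep to a list
typeof(keep)

# Using the coerced vector as a subscript now fails with the reported message.
msg <- tryCatch((1:3)[keep], error = function(e) conditionMessage(e))
msg
```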

Duncan Murdoch

William Dunlap

Re: error in install.packages() (PR#14042)

> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Duncan Murdoch
> Sent: Wednesday, November 04, 2009 8:47 AM
> To: [hidden email]
> Cc: [hidden email]; [hidden email]
> Subject: Re: [Rd] error in install.packages() (PR#14042)
>
> I've found the error, and will fix and commit to R-devel and
> R-patched.
>
> For future reference:  the problem was that it assigned the result of
> sapply() to a subset of a vector.  Normally sapply() simplifies its
> result to a vector, but in this case the result was empty, so sapply()
> returned an empty list; assigning a list to a vector coerced the vector
> to a list, and then the "invalid subscript type 'list'" came soon after.

I've run into this sort of problem a lot (0-long input to sapply
causes it to return list()).  A related problem is that when sapply's
FUN doesn't always return the type of value you expect for some
corner case then sapply won't do the expected simplification.  If
sapply had an argument that gave the expected form of FUN's output
then sapply could (a) die if some call to FUN didn't return something
of that form and (b) return a 0-long object of the correct form
if sapply's X has length zero so FUN is never called.  E.g.,
   sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on third iteration
   sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return logical(0)
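Current versions of base R provide vapply(), whose FUN.VALUE argument behaves much like the argument proposed here. As a sketch, the two examples above can be tried with it; vapply() stands in for the hypothetical sapply(..., FUN.VALUE=) call, and the names empty and err are only for illustration:

```r
# (b) Zero-length input: FUN is never called, yet the result has the declared type.
empty <- vapply(integer(0), function(i) i > 0, FUN.VALUE = logical(1))
empty                                   # logical(0)

# (a) A result of the wrong shape is an error: (11:20)[0] has length 0, not 1.
err <- tryCatch(
    vapply(2:0, function(i) (11:20)[i], FUN.VALUE = integer(1)),
    error = function(e) conditionMessage(e)
)
err
```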

Another benefit of sapply knowing the type of FUN's return value is
that it wouldn't have to waste space creating a list of FUN's return
values but could stuff them directly into the final output structure.
A list of n scalar doubles is 4.5 times bigger than double(n) and the
factor is 9.0 for integers and logicals.
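The size difference can be checked with object.size() from the utils package. The exact ratios depend on the R version and on whether the build is 32-bit or 64-bit, so the figures quoted above should not be expected to reproduce exactly. The value of n and the variable names are arbitrary:

```r
n <- 10000L

vec_bytes  <- as.numeric(object.size(numeric(n)))           # one atomic double vector
list_bytes <- as.numeric(object.size(as.list(numeric(n))))  # n separate length-1 doubles
list_bytes / vec_bytes

int_vec_bytes  <- as.numeric(object.size(integer(n)))
int_list_bytes <- as.numeric(object.size(as.list(integer(n))))
int_list_bytes / int_vec_bytes
```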

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

Duncan Murdoch

Re: error in install.packages() (PR#14042)

On 11/4/2009 12:15 PM, William Dunlap wrote:

> I've run into this sort of problem a lot (0-long input to sapply
> causes it to return list()).  A related problem is that when sapply's
> FUN doesn't always return the type of value you expect for some
> corner case then sapply won't do the expected simplification.  If
> sapply had an argument that gave the expected form of FUN's output
> then sapply could (a) die if some call to FUN didn't return something
> of that form and (b) return a 0-long object of the correct form
> if sapply's X has length zero so FUN is never called.  E.g.,
>    sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on third iteration
>    sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return logical(0)
>
> Another benefit of sapply knowing the type of FUN's return value is
> that it wouldn't have to waste space creating a list of FUN's return
> values but could stuff them directly into the final output structure.
> A list of n scalar doubles is 4.5 times bigger than double(n) and the
> factor is 9.0 for integers and logicals.

That sounds like a good idea.  It would be a bit of work, because the
current sapply depends on lapply while this would need its own internal
implementation, but it would probably be worthwhile.

Duncan Murdoch

Prof Brian Ripley

Re: error in install.packages() (PR#14042)

I agree it is a good idea, but a new name seems justified to avoid
confusion.

On Wed, 4 Nov 2009, Duncan Murdoch wrote:

> That sounds like a good idea.  It would be a bit of work, because the current
> sapply depends on lapply while this would need its own internal
> implementation:  but it would probably be worthwhile.
>
> Duncan Murdoch

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

William Dunlap

Re: error in install.packages() (PR#14042)

> -----Original Message-----
> From: Prof Brian Ripley [mailto:[hidden email]]
> Sent: Wednesday, November 04, 2009 12:19 PM
> To: Duncan Murdoch
> Cc: William Dunlap; [hidden email]
> Subject: Re: [Rd] error in install.packages() (PR#14042)
>
> I agree it is a good idea, but a new name seems justified to avoid
> confusion.

If you could decide on a good name for the new argument
and the format of the data in it I could implement it in
S+ and keep R & S+ compatible.  The format issue seems
bigger to me.  Giving a prototype of the expected return
value is very flexible but wastes a bit of space.  I propose
treating it much as the value of FUN(X[[1]]) is treated.
If the prototype included names then those could become the row
names of the matrix output, instead of the names on the
actual return values.  (I would ignore the row names
when asking if the expected return value sufficiently resembled
the actual one.)  E.g., the current
   > sapply(split(log(1:10), rep(letters[1:2],c(3,7))), quantile, (1:2)/3)
                     a        b
   33.33333% 0.4620981 1.791759
   66.66667% 0.8283022 2.079442
with THE.NEW.ARGUMENT=c(T1=0,T2=0) would return
              a        b
   T1 0.4620981 1.791759
   T2 0.8283022 2.079442
(I don't know if that behavior is needed, but it is a corollary
of using THE.NEW.ARGUMENT instead of FUN(X[[1]]) as the source
of row names and perhaps other data.)
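Base R's vapply(), available in current versions, takes a prototype of this kind through its FUN.VALUE argument, and according to its documentation the names on the prototype are used for the rows of the result. A sketch of the example above with vapply() standing in for THE.NEW.ARGUMENT (groups and res are illustrative names):

```r
groups <- split(log(1:10), rep(letters[1:2], c(3, 7)))

# The prototype c(T1 = 0, T2 = 0) declares two doubles per group and names the rows.
res <- vapply(groups, quantile, FUN.VALUE = c(T1 = 0, T2 = 0), probs = (1:2) / 3)
res
dimnames(res)
```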

Should THE.NEW.ARGUMENT's mode have to match exactly the mode
of FUN(X[[i]]) or should it just be possible to coerce the value
of FUN(X[[i]]) to it?
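For comparison, base R's vapply() takes a middle ground on this question: its documentation states that a result may be promoted within the ordering logical < integer < double < complex, but is never demoted. A short sketch follows; it shows vapply()'s behaviour, not a decision reached in this thread, and up and down are illustrative names:

```r
# Integer results are accepted where doubles are declared (promotion).
up <- vapply(1:3, function(i) i, FUN.VALUE = numeric(1))
typeof(up)                              # "double"

# Double results are rejected where integers are declared (no demotion).
down <- tryCatch(
    vapply(1:3, function(i) i + 0.5, FUN.VALUE = integer(1)),
    error = function(e) "error"
)
down
```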

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

Duncan Murdoch

sapply improvements

In reply to this post by William Dunlap
On 11/4/2009 12:15 PM, William Dunlap wrote:

> I've run into this sort of problem a lot (0-long input to sapply
> causes it to return list()).  A related problem is that when sapply's
> FUN doesn't always return the type of value you expect for some
> corner case then sapply won't do the expected simplification.  If
> sapply had an argument that gave the expected form of FUN's output
> then sapply could (a) die if some call to FUN didn't return something
> of that form and (b) return a 0-long object of the correct form
> if sapply's X has length zero so FUN is never called.  E.g.,
>    sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on third iteration
>    sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return logical(0)
>
> Another benefit of sapply knowing the type of FUN's return value is
> that it wouldn't have to waste space creating a list of FUN's return
> values but could stuff them directly into the final output structure.
> A list of n scalar doubles is 4.5 times bigger than double(n) and the
> factor is 9.0 for integers and logicals.


What do you think of the behaviour of the sapply function below?  (I
wouldn't put it into R as it is, I'd translate it to C code to avoid the
lapply call; but I'd like to get the behaviour right before doing that.)

This one checks that the length() and typeof() results are consistent.
If the FUN.VALUE has names, those are used (but it doesn't require the
names from FUN to match).

Duncan Murdoch

sapply <- function(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE,
                   FUN.VALUE)
{
    FUN <- match.fun(FUN)
    answer <- lapply(X, FUN, ...)
    if (USE.NAMES && is.character(X) && is.null(names(answer)))
        names(answer) <- X
    if (simplify) {
        if (missing(FUN.VALUE)) {
            # No prototype: return the list for an empty or ragged result.
            if ((!length(answer))
                || length(common.len <- unique(unlist(lapply(answer, length)))) != 1L)
                return(answer)
            common.names <- names(answer[[1L]])
        } else {
            # Prototype given: every result must match its length and type.
            common.len <- length(FUN.VALUE)
            common.type <- typeof(FUN.VALUE)
            common.names <- names(FUN.VALUE)
            if (length(answer)) {
                if (any(unlist(lapply(answer, length)) != common.len))
                    stop(sprintf("%s values must be of length %d", "FUN",
                                 common.len))
                if (any(unlist(lapply(answer, typeof)) != common.type))
                    stop(sprintf("%s values must be of type '%s'", "FUN",
                                 common.type))
                if (is.null(common.names))
                    common.names <- names(answer[[1L]])
            } else if (length(FUN.VALUE) > 1)
                # Empty input: a 0-column matrix of the prototype's type.
                return(array(FUN.VALUE[0], dim = c(common.len, 0),
                             dimnames = if (!is.null(common.names))
                                 list(common.names, character(0))))
            else
                return(FUN.VALUE[0])
        }
        if (common.len == 1L)
            unlist(answer, recursive = FALSE)
        else if (common.len > 1L)
            array(unlist(answer, recursive = FALSE),
                  dim = c(common.len, length(X)),
                  # `&` rather than `&&`, so that n2 is always assigned.
                  dimnames = if (!(is.null(common.names) &
                                   is.null(n2 <- names(answer))))
                      list(common.names, n2))
        else answer
    } else answer
}

Gabor Grothendieck

Re: sapply improvements

S4 generics can specify a valueClass.  Perhaps that could be used in
those cases.

On Wed, Nov 4, 2009 at 3:24 PM, Duncan Murdoch <[hidden email]> wrote:

> What do you think of the behaviour of the sapply function below?  (I
> wouldn't put it into R as it is, I'd translate it to C code to avoid the
> lapply call; but I'd like to get the behaviour right before doing that.)
>
> This one checks that the length() and typeof() results are consistent. If
> the FUN.VALUE has names, those are used (but it doesn't require the names
> from FUN to match).
>
> Duncan Murdoch
>
> sapply <- function(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE,
> FUN.VALUE)
> {
>    FUN <- match.fun(FUN)
>    answer <- lapply(X, FUN, ...)
>    if(USE.NAMES && is.character(X) && is.null(names(answer)))
>                names(answer) <- X
>    if(simplify) {
>        if (missing(FUN.VALUE)) {
>            if ((!length(answer))
>                || length(common.len <- unique(unlist(lapply(answer,
> length)))) != 1L)
>                return(answer)
>            common.names <- names(answer[[1L]])
>        } else {
>            common.len <- length(FUN.VALUE)
>            common.type <- typeof(FUN.VALUE)
>            common.names <- names(FUN.VALUE)
>            if (length(answer)) {
>                if (any( unlist(lapply(answer, length)) != common.len ))
>                    stop(sprintf("%s values must be of length %d", "FUN", common.len))
>                if (any( unlist(lapply(answer, typeof)) != common.type ))
>                    stop(sprintf("%s values must be of type '%s'", "FUN", common.type))
>                if (is.null(common.names))
>                    common.names <- names(answer[[1L]])
>            } else if (length(FUN.VALUE) > 1)
>                return(array(FUN.VALUE[0], dim=c(common.len, 0),
>                             dimnames= if(!is.null(common.names))
>                                       list(common.names,character(0))))
>            else
>                return(FUN.VALUE[0])
>        }
>        if(common.len == 1L)
>            unlist(answer, recursive = FALSE)
>        else if(common.len > 1L)
>            array(unlist(answer, recursive = FALSE),
>                  dim= c(common.len, length(X)),
>                  dimnames= if(!(is.null(common.names) &
>                                 is.null(n2 <- names(answer))))
>                                list(common.names,n2))
>        else answer
>    } else answer
> }
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

______________________________________________
William Dunlap

Re: sapply improvements

In reply to this post by Duncan Murdoch
> -----Original Message-----
> From: William Dunlap
> Sent: Wednesday, November 04, 2009 12:53 PM
> To: 'Duncan Murdoch'
> Cc: [hidden email]
> Subject: RE: sapply improvements
>
> It looks good on following examples:
>
> > z <- split(log(1:10), rep(letters[1:2],c(3,7)))
> > sapply(z, length, FUN.VALUE=numeric(1))
> Error in sapply(z, length, FUN.VALUE = numeric(1)) :
>   FUN values must be of type 'double'
>
> (I'd like the error to say "... must be of type 'double',
> not 'integer'", to give the user a fuller diagnosis of
> the problem.)

If this new argument gets used much, it may give a
push towards getting functions to always return the
same type of output.  E.g., range(integer(0)) returns
a numeric while range(integer(1)) returns an integer,
resulting in:
   > z<-split(1:10, cut(log(1:10),breaks=0:4,include.lowest=TRUE))
   > # z[[4]] is integer(0)
   > sapply(z,range,FUN.VALUE=integer(2))
   Error in sapply(z, range, FUN.VALUE = integer(2)) :
     FUN values must be of type 'integer'
   In addition: Warning messages:
   1: In min(x) : no non-missing arguments to min; returning Inf
   2: In max(x) : no non-missing arguments to max; returning -Inf
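
The inconsistency can be checked directly. The following is only a
sketch, assuming an R version that provides vapply() (the function
announced at the end of this thread); the variable names are
illustrative and not taken from the original posts.

```r
# Same split as above; the fourth group is empty.
z <- split(1:10, cut(log(1:10), breaks = 0:4, include.lowest = TRUE))

# range() of an empty integer vector falls back to Inf/-Inf, which are doubles.
type_empty    <- typeof(suppressWarnings(range(integer(0))))
type_nonempty <- typeof(range(integer(1)))

# Requesting integer results therefore fails on the empty group, because a
# double result is not demoted to integer.
res <- tryCatch(suppressWarnings(vapply(z, range, FUN.VALUE = integer(2))),
                error = function(e) e)
```

Here type_empty and type_nonempty differ, and res holds the error
condition rather than a matrix.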

>
> > sapply(z, range, FUN.VALUE=c(Min=0,Max=0))
>            a        b
> Min 0.000000 1.386294
> Max 1.098612 2.302585
>
> Exactly matching the typeof's and using the names
> for row.names on matrix output seem good to me.
>  
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com  
>
> > -----Original Message-----
> > From: Duncan Murdoch [mailto:[hidden email]]
> > Sent: Wednesday, November 04, 2009 12:24 PM
> > To: William Dunlap
> > Cc: [hidden email]; [hidden email]
> > Subject: sapply improvements
> >
> > On 11/4/2009 12:15 PM, William Dunlap wrote:
> > >> -----Original Message-----
> > >> From: [hidden email]
> > >> [mailto:[hidden email]] On Behalf Of
> Duncan Murdoch
> > >> Sent: Wednesday, November 04, 2009 8:47 AM
> > >> To: [hidden email]
> > >> Cc: [hidden email]; [hidden email]
> > >> Subject: Re: [Rd] error in install.packages() (PR#14042)
> > >>
> ...
> > >> For future reference:  the problem was that it assigned
> > the result of
> > >> sapply() to a subset of a vector.  Normally sapply()
> > simplifies its
> > >> result to a vector, but in this case the result was empty, so
> > >> sapply()
> > >> returned an empty list; assigning a list to a vector coerced
> > >> the vector
> > >> to a list, and then the "invalid subscript type 'list'" came
> > >> soon after.
> > >
> > > I've run into this sort of problem a lot (0-long input to sapply
> > > causes it to return list()).  A related problem is that
> > when sapply's
> > > FUN doesn't always return the type of value you expect for some
> > > corner case then sapply won't do the expected simplification.  If
> > > sapply had an argument that gave the expected form of FUN's output
> > > then sapply could (a) die if some call to FUN didn't return
> > something
> > > of that form and (b) return a 0-long object of the correct form
> > > if sapply's X has length zero so FUN is never called.  E.g.,
> > >    sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on third iteration
> > >    sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return logical(0)
> > >
> > > Another benefit of sapply knowing the type of FUN's
> return value is
> > > that it wouldn't have to waste space creating a list of
> FUN's return
> > > values but could stuff them directly into the final output
> > structure.
> > > A list of n scalar doubles is 4.5 times bigger than
> > double(n) and the
> > > factor is 9.0 for integers and logicals.
> >
> >
> > What do you think of the behaviour of the sapply function
> below?  (I
> > wouldn't put it into R as it is, I'd translate it to C code
> > to avoid the
> > lapply call; but I'd like to get the behaviour right before
> > doing that.)
> >
> > This one checks that the length() and typeof() results are
> > consistent.
> > If the FUN.VALUE has names, those are used (but it doesn't
> > require the
> > names from FUN to match).
> ...

______________________________________________
William Dunlap

Re: sapply improvements

In reply to this post by Duncan Murdoch
It looks good on the following examples:

> z <- split(log(1:10), rep(letters[1:2],c(3,7)))
> sapply(z, length, FUN.VALUE=numeric(1))
Error in sapply(z, length, FUN.VALUE = numeric(1)) :
  FUN values must be of type 'double'

(I'd like the error to say "... must be of type 'double',
not 'integer'", to give the user a fuller diagnosis of
the problem.)

> sapply(z, range, FUN.VALUE=c(Min=0,Max=0))
           a        b
Min 0.000000 1.386294
Max 1.098612 2.302585

Exactly matching the typeof's and using the names
for row.names on matrix output seem good to me.
 
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: Duncan Murdoch [mailto:[hidden email]]
> Sent: Wednesday, November 04, 2009 12:24 PM
> To: William Dunlap
> Cc: [hidden email]; [hidden email]
> Subject: sapply improvements
>
> On 11/4/2009 12:15 PM, William Dunlap wrote:
> >> -----Original Message-----
> >> From: [hidden email]
> >> [mailto:[hidden email]] On Behalf Of Duncan Murdoch
> >> Sent: Wednesday, November 04, 2009 8:47 AM
> >> To: [hidden email]
> >> Cc: [hidden email]; [hidden email]
> >> Subject: Re: [Rd] error in install.packages() (PR#14042)
> >>
...

> >> For future reference:  the problem was that it assigned
> the result of
> >> sapply() to a subset of a vector.  Normally sapply()
> simplifies its
> >> result to a vector, but in this case the result was empty, so
> >> sapply()
> >> returned an empty list; assigning a list to a vector coerced
> >> the vector
> >> to a list, and then the "invalid subscript type 'list'" came
> >> soon after.
> >
> > I've run into this sort of problem a lot (0-long input to sapply
> > causes it to return list()).  A related problem is that
> when sapply's
> > FUN doesn't always return the type of value you expect for some
> > corner case then sapply won't do the expected simplification.  If
> > sapply had an argument that gave the expected form of FUN's output
> > then sapply could (a) die if some call to FUN didn't return
> something
> > of that form and (b) return a 0-long object of the correct form
> > if sapply's X has length zero so FUN is never called.  E.g.,
> >    sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on third iteration
> >    sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return logical(0)
> >
> > Another benefit of sapply knowing the type of FUN's return value is
> > that it wouldn't have to waste space creating a list of FUN's return
> > values but could stuff them directly into the final output
> structure.
> > A list of n scalar doubles is 4.5 times bigger than
> double(n) and the
> > factor is 9.0 for integers and logicals.
>
>
> What do you think of the behaviour of the sapply function below?  (I
> wouldn't put it into R as it is, I'd translate it to C code
> to avoid the
> lapply call; but I'd like to get the behaviour right before
> doing that.)
>
> This one checks that the length() and typeof() results are
> consistent.
> If the FUN.VALUE has names, those are used (but it doesn't
> require the
> names from FUN to match).
...

______________________________________________
Peter Dalgaard

Re: sapply improvements

William Dunlap wrote:

> It looks good on following examples:
>
>> z <- split(log(1:10), rep(letters[1:2],c(3,7)))
>> sapply(z, length, FUN.VALUE=numeric(1))
> Error in sapply(z, length, FUN.VALUE = numeric(1)) :
>   FUN values must be of type 'double'
>
> (I'd like the error to say "... must be of type 'double',
> not 'integer'", to give the user a fuller diagnosis of
> the problem.)

Umm, not following too closely, but would it not be preferable just to
coerce in such cases? I can see a lot of issues of the

if (x <= 0) NA else log(x)

variety otherwise.
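
A small sketch of why that pattern is a problem for exact type
matching, assuming an R version that provides vapply() (announced at
the end of this thread), which promotes logical to double. The helper
name safe_log is hypothetical.

```r
# NA on its own is a logical constant, so this function returns a logical
# for non-positive input and a double otherwise.
safe_log <- function(x) if (x <= 0) NA else log(x)   # hypothetical helper

types <- c(typeof(safe_log(-1)), typeof(safe_log(1)))

# With upward coercion allowed, the logical NA is promoted to double.
out <- vapply(c(-1, 1), safe_log, FUN.VALUE = numeric(1))
```

Under a strict typeof() comparison the first call would be rejected;
with promotion the result is an ordinary double vector containing one NA.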

>> sapply(z, range, FUN.VALUE=c(Min=0,Max=0))
>            a        b
> Min 0.000000 1.386294
> Max 1.098612 2.302585
>
> Exactly matching the typeof's and using the names
> for row.names on matrix output seem good to me.
>  
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com  
>
>> -----Original Message-----
>> From: Duncan Murdoch [mailto:[hidden email]]
>> Sent: Wednesday, November 04, 2009 12:24 PM
>> To: William Dunlap
>> Cc: [hidden email]; [hidden email]
>> Subject: sapply improvements
>>
>> On 11/4/2009 12:15 PM, William Dunlap wrote:
>>>> -----Original Message-----
>>>> From: [hidden email]
>>>> [mailto:[hidden email]] On Behalf Of Duncan Murdoch
>>>> Sent: Wednesday, November 04, 2009 8:47 AM
>>>> To: [hidden email]
>>>> Cc: [hidden email]; [hidden email]
>>>> Subject: Re: [Rd] error in install.packages() (PR#14042)
>>>>
> ...
>>>> For future reference:  the problem was that it assigned
>> the result of
>>>> sapply() to a subset of a vector.  Normally sapply()
>> simplifies its
>>>> result to a vector, but in this case the result was empty, so
>>>> sapply()
>>>> returned an empty list; assigning a list to a vector coerced
>>>> the vector
>>>> to a list, and then the "invalid subscript type 'list'" came
>>>> soon after.
>>> I've run into this sort of problem a lot (0-long input to sapply
>>> causes it to return list()).  A related problem is that
>> when sapply's
>>> FUN doesn't always return the type of value you expect for some
>>> corner case then sapply won't do the expected simplification.  If
>>> sapply had an argument that gave the expected form of FUN's output
>>> then sapply could (a) die if some call to FUN didn't return
>> something
>>> of that form and (b) return a 0-long object of the correct form
>>> if sapply's X has length zero so FUN is never called.  E.g.,
>>>    sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on third iteration
>>>    sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return logical(0)
>>>
>>> Another benefit of sapply knowing the type of FUN's return value is
>>> that it wouldn't have to waste space creating a list of FUN's return
>>> values but could stuff them directly into the final output
>> structure.
>>> A list of n scalar doubles is 4.5 times bigger than
>> double(n) and the
>>> factor is 9.0 for integers and logicals.
>>
>> What do you think of the behaviour of the sapply function below?  (I
>> wouldn't put it into R as it is, I'd translate it to C code
>> to avoid the
>> lapply call; but I'd like to get the behaviour right before
>> doing that.)
>>
>> This one checks that the length() and typeof() results are
>> consistent.
>> If the FUN.VALUE has names, those are used (but it doesn't
>> require the
>> names from FUN to match).
> ...
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


--
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([hidden email])              FAX: (+45) 35327907

______________________________________________
William Dunlap

Re: sapply improvements

> -----Original Message-----
> From: Peter Dalgaard [mailto:[hidden email]]
> Sent: Wednesday, November 04, 2009 1:16 PM
> To: William Dunlap
> Cc: Duncan Murdoch; [hidden email]
> Subject: Re: [Rd] sapply improvements
>
> William Dunlap wrote:
> > It looks good on following examples:
> >
> >> z <- split(log(1:10), rep(letters[1:2],c(3,7)))
> >> sapply(z, length, FUN.VALUE=numeric(1))
> > Error in sapply(z, length, FUN.VALUE = numeric(1)) :
> >   FUN values must be of type 'double'
> >
> > (I'd like the error to say "... must be of type 'double',
> > not 'integer'", to give the user a fuller diagnosis of
> > the problem.)
>
> Umm, not following too closely, but would it not be
> preferable just to
> coerce in such cases? I can see a lot of issues of the
>
> if (x <= 0) NA else log(x)
>
> variety otherwise.

Would you only want it to coerce upwards to FUN.VALUE's
type?  E.g., allow
   sapply(z, length, FUN.VALUE=numeric(1))
to return a numeric vector but die on
   sapply(z, function(zi)as.complex(zi[1]), FUN.VALUE=numeric(1))
If the latter doesn't die should it return
a complex or a numeric vector?  (I'd say it
needs to be numeric, but I'd prefer that it
died.)
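
For comparison, this is how the two calls behave with vapply(), the
function that eventually came out of this thread. It is a sketch that
assumes an R version providing vapply(); the variable names are
illustrative.

```r
z <- split(log(1:10), rep(letters[1:2], c(3, 7)))

# length() returns integers; they are promoted to the requested double type.
lens <- vapply(z, length, FUN.VALUE = numeric(1))

# A complex result is not demoted to double, so this call signals an error.
res <- tryCatch(
    vapply(z, function(zi) as.complex(zi[1]), FUN.VALUE = numeric(1)),
    error = function(e) e)
```

So the first call returns a named double vector and the second one
dies, which matches the preference stated above.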
 
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

>
> >> sapply(z, range, FUN.VALUE=c(Min=0,Max=0))
> >            a        b
> > Min 0.000000 1.386294
> > Max 1.098612 2.302585
> >
> > Exactly matching the typeof's and using the names
> > for row.names on matrix output seem good to me.
> >  
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com  
> >
> >> -----Original Message-----
> >> From: Duncan Murdoch [mailto:[hidden email]]
> >> Sent: Wednesday, November 04, 2009 12:24 PM
> >> To: William Dunlap
> >> Cc: [hidden email]; [hidden email]
> >> Subject: sapply improvements
> >>
> >> On 11/4/2009 12:15 PM, William Dunlap wrote:
> >>>> -----Original Message-----
> >>>> From: [hidden email]
> >>>> [mailto:[hidden email]] On Behalf Of
> Duncan Murdoch
> >>>> Sent: Wednesday, November 04, 2009 8:47 AM
> >>>> To: [hidden email]
> >>>> Cc: [hidden email]; [hidden email]
> >>>> Subject: Re: [Rd] error in install.packages() (PR#14042)
> >>>>
> > ...
> >>>> For future reference:  the problem was that it assigned
> >> the result of
> >>>> sapply() to a subset of a vector.  Normally sapply()
> >> simplifies its
> >>>> result to a vector, but in this case the result was empty, so
> >>>> sapply()
> >>>> returned an empty list; assigning a list to a vector coerced
> >>>> the vector
> >>>> to a list, and then the "invalid subscript type 'list'" came
> >>>> soon after.
> >>> I've run into this sort of problem a lot (0-long input to sapply
> >>> causes it to return list()).  A related problem is that
> >> when sapply's
> >>> FUN doesn't always return the type of value you expect for some
> >>> corner case then sapply won't do the expected simplification.  If
> >>> sapply had an argument that gave the expected form of FUN's output
> >>> then sapply could (a) die if some call to FUN didn't return
> >> something
> >>> of that form and (b) return a 0-long object of the correct form
> >>> if sapply's X has length zero so FUN is never called.  E.g.,
> >>>    sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on third iteration
> >>>    sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return logical(0)
> >>>
> >>> Another benefit of sapply knowing the type of FUN's
> return value is
> >>> that it wouldn't have to waste space creating a list of
> FUN's return
> >>> values but could stuff them directly into the final output
> >> structure.
> >>> A list of n scalar doubles is 4.5 times bigger than
> >> double(n) and the
> >>> factor is 9.0 for integers and logicals.
> >>
> >> What do you think of the behaviour of the sapply function
> below?  (I
> >> wouldn't put it into R as it is, I'd translate it to C code
> >> to avoid the
> >> lapply call; but I'd like to get the behaviour right before
> >> doing that.)
> >>
> >> This one checks that the length() and typeof() results are
> >> consistent.
> >> If the FUN.VALUE has names, those are used (but it doesn't
> >> require the
> >> names from FUN to match).
> > ...
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
> --
>     O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>    c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>   (*) \(*) -- University of Copenhagen   Denmark      Ph:  
> (+45) 35327918
> ~~~~~~~~~~ - ([hidden email])              FAX:
> (+45) 35327907
>

______________________________________________
Peter Dalgaard

Re: sapply improvements

William Dunlap wrote:
...

>>
>> if (x <= 0) NA else log(x)
>>
>> variety otherwise.
>
> Would you only want it to coerce upwards to FUN.VALUES's
> type?  E.g., allow
>    sapply(z, length, FUN.VALUE=numeric(1))
> to return a numeric vector but die on
>    sapply(z, function(zi)as.complex(zi[1]), FUN.VALUE=numeric(1))
> If the latter doesn't die should it return
> a complex or a numeric vector?  (I'd say it
> needs to be numeric, but I'd prefer that it
> died.)

I'd say that it should probably die on downwards coercion. Getting a
double when an integer is expected, or complex instead of double as you
indicate, is a likely user error. If not, then the user can always
coerce explicitly inside FUN.

Another issue is whether one would want to go beyond the base classes of
  S (logical, integer, double, complex, character). For other classes,
there may be no notion of "up" and "down" in coercion. Then again,
sapply was always limited to what unlist() will handle, so e.g.

 > sapply(1:10,FUN=function(i)Sys.Date())
  [1] 14553 14553 14553 14553 14553 14553 14553 14553 14553 14553

as opposed to

 > structure(rep(14553,10), class="Date")
  [1] "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05"
  [6] "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05"
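
A reproducible version of the same observation, using a fixed date
instead of Sys.Date(). The do.call(c, ...) line is one possible
workaround and is an addition here, not something proposed in the
thread; the variable names are illustrative.

```r
d <- as.Date("2009-11-05")   # fixed date so the result is reproducible

# sapply() simplifies via unlist(), which drops the class attribute.
plain <- sapply(1:3, function(i) d)

# One workaround: keep the list from lapply() and combine it with c(),
# which dispatches to the Date method and keeps the class.
dates <- do.call(c, lapply(1:3, function(i) d))
```

plain is a bare numeric vector of day counts, while dates is still of
class "Date".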




--
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([hidden email])              FAX: (+45) 35327907

______________________________________________
Martin Maechler

Re: sapply improvements

>>>>> "PD" == Peter Dalgaard <[hidden email]>
>>>>>     on Thu, 05 Nov 2009 00:28:51 +0100 writes:

    PD> William Dunlap wrote: ...
    >>>
    >>> if (x <= 0) NA else log(x)
    >>>
    >>> variety otherwise.
    >>
    >> Would you only want it to coerce upwards to FUN.VALUES's
    >> type?  E.g., allow sapply(z, length,
    >> FUN.VALUE=numeric(1)) to return a numeric vector but die
    >> on sapply(z, function(zi)as.complex(zi[1]),
    >> FUN.VALUE=numeric(1)) If the latter doesn't die should it
    >> return a complex or a numeric vector?  (I'd say it needs
    >> to be numeric, but I'd prefer that it died.)

    PD> I'd say that it should probably die on downwards
    PD> coercion. Getting a double when an integer is expected,
    PD> or complex instead of double as you indicate, is a
    PD> likely user error. If not, then the user can always
    PD> coerce explicitly inside FUN.

I agree with Peter: Do allow coercion downwards

    PD> Another issue is whether one would want to go beyond the
    PD> base classes of S (logical, integer, double, complex,
    PD> character). For other classes, there may be no notion of
    PD> "up" and "down" in coercion. Then again, sapply was
    PD> always limited to what unlist() will handle, so e.g.

    >> sapply(1:10,FUN=function(i)Sys.Date())
    PD>   [1] 14553 14553 14553 14553 14553 14553 14553 14553 14553 14553

    PD> as opposed to

    >> structure(rep(14553,10), class="Date")
    PD>   [1] "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05"
    PD>   [6] "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05"

Well, using    
      as(<prelim_result>,  class(<prototype>) )

would be really nice here....
but alas, we are still not allowed to use  as(.,.) in base
code which I'd tend to call  a "design bug" nowadays..

Martin

______________________________________________
Duncan Murdoch

Re: sapply improvements

On 11/5/2009 4:05 AM, Martin Maechler wrote:

>>>>>> "PD" == Peter Dalgaard <[hidden email]>
>>>>>>     on Thu, 05 Nov 2009 00:28:51 +0100 writes:
>
>     PD> William Dunlap wrote: ...
>     >>>
>     >>> if (x <= 0) NA else log(x)
>     >>>
>     >>> variety otherwise.
>     >>
>     >> Would you only want it to coerce upwards to FUN.VALUES's
>     >> type?  E.g., allow sapply(z, length,
>     >> FUN.VALUE=numeric(1)) to return a numeric vector but die
>     >> on sapply(z, function(zi)as.complex(zi[1]),
>     >> FUN.VALUE=numeric(1)) If the latter doesn't die should it
>     >> return a complex or a numeric vector?  (I'd say it needs
>     >> to be numeric, but I'd prefer that it died.)
>
>     PD> I'd say that it should probably die on downwards
>     PD> coercion. Getting a double when an integer is expected,
>     PD> or complex instead of double as you indicate, is a
>     PD> likely user error. If not, then the user can always
>     PD> coerce explicitly inside FUN.
>
> I agree with Peter: Do allow coercion downwards
>
>     PD> Another issue is whether one would want to go beyond the
>     PD> base classes of S (logical, integer, double, complex,
>     PD> character). For other classes, there may be no notion of
>     PD> "up" and "down" in coercion. Then again, sapply was
>     PD> always limited to what unlist() will handle, so e.g.
>
>     >> sapply(1:10,FUN=function(i)Sys.Date())
>     PD>   [1] 14553 14553 14553 14553 14553 14553 14553 14553 14553 14553
>
>     PD> as opposed to
>
>     >> structure(rep(14553,10), class="Date")
>     PD>   [1] "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05"
>     PD>   [6] "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05"
>
> Well, using    
>       as(<prelim_result>,  class(<prototype>) )
>
> would be really nice here....
> but alas, we are still not allowed to use  as(.,.) in base
> code which I'd tend to call  a "design bug" nowadays..

Part of the difficulty here is that we have too many concepts of "class"
and "type" in R.  For example, as() is not consistent with as.vector()
in the following sense:

If neither input is an S4 object, we should have

as(<prelim_result>,  class(<prototype>) )

be the same as

as.vector(<prelim_result>, typeof(<prototype>))

and

as.vector(<prelim_result>, class(<prototype>))

and currently as() gives a different result.  For example,

 > str(as(1:10, class(double(1))))
  int [1:10] 1 2 3 4 5 6 7 8 9 10
 > str(as.vector(1:10, typeof(double(1))))
  num [1:10] 1 2 3 4 5 6 7 8 9 10
 > str(as.vector(1:10, class(double(1))))
  num [1:10] 1 2 3 4 5 6 7 8 9 10

So if the coercion were to support as(), we'd need to decide when to
follow its rules, and when to follow the existing as.vector() rules
(which I think we're more or less following in the current sapply()).

We'd also need to handle the cases involving S4 objects:

I'd say if the prototype is not S4 but the result is, we should die with
an error.

If the prototype is S4, then we should use as().  We have fast C code to
detect S4 objects, do we have C code to do the coercion?  I'd rather not
write it, but I wouldn't object if someone else did/already has.

Duncan Murdoch

______________________________________________
Duncan Murdoch

Re: sapply improvements

In reply to this post by Martin Maechler
On 05/11/2009 4:05 AM, Martin Maechler wrote:

>>>>>> "PD" == Peter Dalgaard <[hidden email]>
>>>>>>     on Thu, 05 Nov 2009 00:28:51 +0100 writes:
>
>     PD> William Dunlap wrote: ...
>     >>>
>     >>> if (x <= 0) NA else log(x)
>     >>>
>     >>> variety otherwise.
>     >>
>     >> Would you only want it to coerce upwards to FUN.VALUES's
>     >> type?  E.g., allow sapply(z, length,
>     >> FUN.VALUE=numeric(1)) to return a numeric vector but die
>     >> on sapply(z, function(zi)as.complex(zi[1]),
>     >> FUN.VALUE=numeric(1)) If the latter doesn't die should it
>     >> return a complex or a numeric vector?  (I'd say it needs
>     >> to be numeric, but I'd prefer that it died.)
>
>     PD> I'd say that it should probably die on downwards
>     PD> coercion. Getting a double when an integer is expected,
>     PD> or complex instead of double as you indicate, is a
>     PD> likely user error. If not, then the user can always
>     PD> coerce explicitly inside FUN.
>
> I agree with Peter: Do allow coercion downwards

You missed "not", right?  I.e. we would never coerce a double down to an
integer or logical, but coercion in the other direction would be fine?

Duncan Murdoch


>
>     PD> Another issue is whether one would want to go beyond the
>     PD> base classes of S (logical, integer, double, complex,
>     PD> character). For other classes, there may be no notion of
>     PD> "up" and "down" in coercion. Then again, sapply was
>     PD> always limited to what unlist() will handle, so e.g.
>
>     >> sapply(1:10,FUN=function(i)Sys.Date())
>     PD>   [1] 14553 14553 14553 14553 14553 14553 14553 14553 14553 14553
>
>     PD> as opposed to
>
>     >> structure(rep(14553,10), class="Date")
>     PD>   [1] "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05"
>     PD>   [6] "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05" "2009-11-05"
>
> Well, using    
>       as(<prelim_result>,  class(<prototype>) )
>
> would be really nice here....
> but alas, we are still not allowed to use  as(.,.) in base
> code which I'd tend to call  a "design bug" nowadays..
>
> Martin
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
Martin Maechler

Re: sapply improvements

>>>>> Duncan Murdoch <[hidden email]>
>>>>>     on Thu, 05 Nov 2009 06:24:24 -0500 writes:

    > On 05/11/2009 4:05 AM, Martin Maechler wrote:
    >>>>>>> "PD" == Peter Dalgaard <[hidden email]> on
    >>>>>>> Thu, 05 Nov 2009 00:28:51 +0100 writes:
    >>
    PD> William Dunlap wrote: ...
    >> >>>
    >> >>> if (x <= 0) NA else log(x)
    >> >>>
    >> >>> variety otherwise.
    >> >>
    >> >> Would you only want it to coerce upwards to FUN.VALUES's
    >> >> type?  E.g., allow sapply(z, length,
    >> >> FUN.VALUE=numeric(1)) to return a numeric vector but die
    >> >> on sapply(z, function(zi)as.complex(zi[1]),
    >> >> FUN.VALUE=numeric(1)) If the latter doesn't die should it
    >> >> return a complex or a numeric vector?  (I'd say it needs
    >> >> to be numeric, but I'd prefer that it died.)
    >>
    PD> I'd say that it should probably die on downwards
    PD> coercion. Getting a double when an integer is expected,
    PD> or complex instead of double as you indicate, is a
    PD> likely user error. If not, then the user can always
    PD> coerce explicitly inside FUN.
    >>
    >> I agree with Peter: Do allow coercion downwards

    > You missed "not", right?  I.e. we would never coerce a
    > double down to an integer or logical, but coercion in the
    > other direction would be fine?

Yes, indeed. I missed it, and you are right.
Martin

______________________________________________
Duncan Murdoch

Re: sapply improvements

In reply to this post by William Dunlap
I've just added a version of the improvements to R-devel.  After some
discussion, we decided this should be a new function, not just a new arg
to sapply, and I chose vapply from among the 52 different suggestions
that were offered.

The usage is

  vapply(X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE)

and some extracts from the docs say:


      ‘vapply’ is similar to ‘sapply’, but has a pre-specified
      type of return value, so it can be safer (and sometimes faster) to
      use.

...

      Simplification is always done in ‘vapply’.  This function
      checks that all values of ‘FUN’ are compatible with the
      ‘FUN.VALUE’, in that they must have the same length and type.
      (Types may be promoted to a higher type within the ordering
      logical < integer < real < complex, but not demoted.)

...
      ‘vapply’ returns a vector or matrix of type matching the
      ‘FUN.VALUE’. If ‘length(FUN.VALUE) != 1’ a matrix will be
      returned with ‘length(FUN.VALUE)’ rows and ‘length(X)’
      columns, otherwise a vector of the same length as ‘X’.  Names
      of rows in the matrix value are taken from the ‘FUN.VALUE’ if
      it is named, otherwise from the result of the first function call.
      Column names of the matrix value or names of the vector value are
      set from ‘X’ as in ‘sapply’.
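
A short sketch of the documented behaviour, reusing the examples from
earlier in this thread and assuming an R version that includes
vapply(); the variable names are illustrative.

```r
z <- split(log(1:10), rep(letters[1:2], c(3, 7)))

# Zero-length input: sapply() gives list(), vapply() gives a typed empty vector.
s_empty <- sapply(integer(0), function(i) i > 0)
v_empty <- vapply(integer(0), function(i) i > 0, FUN.VALUE = logical(1))

# Row names come from the names of FUN.VALUE, column names from X.
m <- vapply(z, range, FUN.VALUE = c(Min = 0, Max = 0))

# A result of the wrong length (the third call returns integer(0)) is an error.
bad <- tryCatch(vapply(2:0, function(i) (11:20)[i], FUN.VALUE = integer(1)),
                error = function(e) e)
```

The zero-length case is the one behind the original install.packages()
report: v_empty can be assigned into a logical vector without turning
it into a list.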

Thanks for suggesting this.

Duncan Murdoch
