Outputting multiple subsets

zhijie zhang-2

Outputting multiple subsets

Hi R users,
  I would like to divide the original dataset into several subsets and write
each one to its own file, but my loop produces warnings. See the example.
######
a<-c(1:10)
b<-c(rep(1,3),rep(2,3),rep(3,4))
c<-data.frame(a,b)  #c is the example data
num<-c(unique(b))
# I hope to get the subsets c_1.csv,c_2.csv and c_3.csv
#Errors
for (i in num)  {
   c_num<-c[c$b==num,]
   write.csv(c_num,file="c:/c_num.csv")
}

Warning messages:
1: In c$b == num :
  longer object length is not a multiple of shorter object length
2: In c$b == num :
  longer object length is not a multiple of shorter object length
3: In c$b == num :
  longer object length is not a multiple of shorter object length
  I think the problem is in file="c:/c_num.csv". Has anybody met
this problem before?
  Thanks very much.
-----------------
Jane Chang
Queen's


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Ista Zahn

Re: Outputting multiple subsets

Have you considered using split?
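
A minimal sketch of what the split approach might look like with the example data from the original post. The data frame is called dat here (a hypothetical name, chosen to avoid masking c), and the files are written to tempdir() instead of c:/, which is an assumption made only so the example runs anywhere.

```r
# Example data from the original post (data frame renamed to 'dat')
a <- 1:10
b <- c(rep(1, 3), rep(2, 3), rep(3, 4))
dat <- data.frame(a, b)

# split() returns a named list with one data frame per distinct value of b
subsets <- split(dat, dat$b)

# Write each subset to its own file: c_1.csv, c_2.csv, c_3.csv
out_dir <- tempdir()  # assumption: replace with your own directory
for (key in names(subsets)) {
  write.csv(subsets[[key]],
            file = file.path(out_dir, paste("c_", key, ".csv", sep = "")),
            row.names = FALSE)
}

# The subsets stay in memory and can be inspected by name
subsets[["2"]]
```

The list keeps every subset available in the session, so nothing has to be read back from disk to look at it.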

-Ista




--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

Johann Hibschman

Re: Outputting multiple subsets

In reply to this post by zhijie zhang-2
On Nov 8, 2009, at 7:23 PM, rusers.sh wrote:

> for (i in num)  {
>   c_num<-c[c$b==num,]
>   write.csv(c_num,file="c:/c_num.csv")
> }
>
> Warning messages:
> 1: In c$b == num :
>  longer object length is not a multiple of shorter object length

This is because you're comparing column b to the entire vector of  
numbers (num), not the current number in the iteration (i).  The first  
line of the loop should be "c_num<-c[c$b==i,]".
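
To make the warning concrete, here is a small sketch that compares the two expressions on the vectors from the original post. The names wrong and right are only illustrative.

```r
b <- c(rep(1, 3), rep(2, 3), rep(3, 4))  # length 10
num <- unique(b)                          # length 3: 1 2 3

# Comparing against the whole vector recycles num as 1 2 3 1 2 3 1 2 3 1
# and warns, because 10 is not a multiple of 3
wrong <- suppressWarnings(b == num)

# Comparing against a single value tests every element against that value
right <- b == num[2]

which(wrong)  # 1 5 9: only the positions that happen to line up
which(right)  # 4 5 6: the rows where b is 2
```

So the original loop selected the same three accidental rows on every iteration instead of one group per iteration.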

 From a style point of view, I'd use "n" as my variable, since "i" is  
too commonly used as an integer index.

Also, you will be overwriting the same file, called "c_num.csv", on  
each iteration.

You should try something more like:

for (n in num) {
   c.n <- c[c$b==n,]
   write.csv(c.n, file=paste("c:/c_", n, ".csv", sep=""))
}
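
As a quick check of the file-name part, the sketch below builds the names on their own, without writing anything. The variable files is just an illustrative name.

```r
num <- c(1, 2, 3)

# paste() is vectorised, so all file names can be built in one call
files <- paste("c_", num, ".csv", sep = "")
files
# "c_1.csv" "c_2.csv" "c_3.csv"

# A distinct name per value means no file is overwritten
length(unique(files)) == length(num)
```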

I hope that helps.

Cheers,
Johann Hibschman

zhijie zhang-2

Re: Outputting multiple subsets

Hi Johann,
 Excellent. That is exactly what I wanted. One small problem: why does "c.n"
not exist afterwards? Shouldn't "c.n" be in memory? Sometimes I would also
like to see "c.n" directly in R, besides exporting it. Could I view "c.n"
with some function inside the loop?
> a<-c(1:10)
> b<-c(rep(1,3),rep(2,3),rep(3,4))
> c<-data.frame(a,b)  #c is the example data
> num<-c(unique(b))
> for (n in num) {
+  c.n <- c[c$b==n,]
+  write.csv(c.n, file=paste("c:/c_", n, ".csv", sep=""))}
> num
[1] 1 2 3
> c.1
Error: object 'c.1' not found
> c.2
Error: object 'c.2' not found
> c.3
Error: object 'c.3' not found

 Thanks a lot.
-----------------
Jane Chang
Queen's



jholtman

Re: Outputting multiple subsets

The loop only ever creates one object, 'c.n', and overwrites it on each
iteration; c.1, c.2 and c.3 are never created. Only the file names look
like that. If you want to keep the values from the loop, use lapply:

result <- lapply(num, function(.num){
    # drop=FALSE guarantees the result stays a data frame
    c.n <- c[c$b == .num, , drop=FALSE]
    write.csv(c.n, file=paste("c:/c_", .num, ".csv", sep=""))
    c.n  # return the value
})
# 'result' will now be a list of each of the subsets
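
The list returned by lapply has no names by default. A possible way to get something close to the c.1, c.2, c.3 objects the poster expected is to name the elements, as sketched below. The data frame is called dat here (a hypothetical name used instead of c), and the write.csv step is left out for brevity.

```r
a <- 1:10
b <- c(rep(1, 3), rep(2, 3), rep(3, 4))
dat <- data.frame(a, b)   # 'dat' is a hypothetical name used instead of 'c'
num <- unique(b)

# One data frame per value of b, collected in a list
result <- lapply(num, function(.num) dat[dat$b == .num, , drop = FALSE])

# Name the elements so they can be looked up as "c.1", "c.2", "c.3"
names(result) <- paste("c.", num, sep = "")

result[["c.2"]]   # rows where b == 2
result$c.3$a      # column a of the third subset
```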




--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

David Winsemius

Re: Outputting multiple subsets

In reply to this post by zhijie zhang-2

On Nov 9, 2009, at 8:45 AM, rusers.sh wrote:

> Hi Johann,
> Excellent. That is what i really want. A little problem is why the "c.n"
> does not exist. Should the "c.n" in the memory? Sometimes, i also hope to
> see "c.n" directly in R besides exporting. Could i see the  "c.n" with some
> function in the loops?

>> a<-c(1:10)
>> b<-c(rep(1,3),rep(2,3),rep(3,4))
>> c<-data.frame(a,b)  #c is the example data

And not a particularly good choice for a variable name by virtue of  
potential "wetware confusion" with the concatenate function, c(.)


>> num<-c(unique(b))
>> for (n in num) {
> +  c.n <- c[c$b==n,]
> +  write.csv(c.n, file=paste("c:/c_", n, ".csv", sep=""))}

>> num
> [1] 1 2 3
>> c.1
> Error: object 'c.1' not found

And you were apparently expecting variables "c.1", "c.2", and "c.3" to
be constructed in that loop? That is way beyond the R interpreter's
current level of integration with the device drivers reading input
from the electroencephalograph that must be sitting on your machine.

Perhaps you could have succeeded with:

dftemp <- list()  # outside the loop; need a list because the result of
                  # each extract operation will be a df
..........
   dftemp[[n]] <- c[c$b == n, ]  # inside the loop
   write.csv(dftemp[[n]], file=paste("c:/c_", n, ".csv", sep=""))}

Since you write each subset to a file immediately, the list is not
strictly needed inside the loop, but it does keep the results in a
form that can be examined later from the command line.
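
For reference, a complete version of that idea might look like the sketch below. Several things here are assumptions rather than part of the suggestion above: the data frame is renamed dat, the list is indexed by character keys (so it also works when the group values are not 1, 2, 3), and the files go to tempdir() so the example runs on any system.

```r
a <- 1:10
b <- c(rep(1, 3), rep(2, 3), rep(3, 4))
dat <- data.frame(a, b)   # hypothetical name, avoids masking c()
num <- unique(b)
out_dir <- tempdir()      # assumption: replace with your own directory

dftemp <- list()          # created once, before the loop
for (n in num) {
  key <- as.character(n)  # character key instead of a positional index
  dftemp[[key]] <- dat[dat$b == n, ]
  write.csv(dftemp[[key]],
            file = file.path(out_dir, paste("c_", n, ".csv", sep = "")))
}

names(dftemp)   # "1" "2" "3"
dftemp[["1"]]   # the first subset, still available after the loop
```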



David Winsemius, MD
Heritage Laboratories
West Hartford, CT

zhijie zhang-2

Re: Outputting multiple subsets

Thanks for your ideas. They are really helpful for me to think about my
question.
Cheers,

--
-----------------
Jane Chang
Queen's
