If you want to exclude the non-smokers, you have a few options. The easiest is probably this:
mean(bwght[bwght$cigs>0,"cigs"])
With a data frame, the first index is the row and the second is the column. So, you can subset using dataframe[1,2]
to get the first row, second column. You can also use a logical condition in the row position. By using bwght$cigs>0
as the first index, you keep only the rows where cigs
is greater than zero.
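Here is a minimal sketch of that subsetting on a toy data frame. The name toy and all of its values are made up for illustration; they are not from the real bwght data.

```r
# Toy stand-in for the bwght data (hypothetical values)
toy <- data.frame(cigs  = c(0, 10, 0, 20, 5),
                  bwght = c(120, 100, 115, 95, 110))

toy[1, 2]                             # first row, second column

# Rows where cigs > 0, keeping only the cigs column
smokers <- toy[toy$cigs > 0, "cigs"]
mean(smokers)                         # average among smokers only

# Same idea, indexing the vector directly
mean(toy$cigs[toy$cigs > 0])
```

If cigs contains missing values, the logical index carries the NAs through, so you would probably also want mean(..., na.rm = TRUE).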
Your other attempts didn't work for the following reasons:
mean(bwght$cigs| bwght$cigs>0)
This is a logical operation, not a filter. The | operator means OR, so you're asking for the TRUE / FALSE result of bwght$cigs OR bwght$cigs>0
, and then taking the mean of that. R does accept logicals in mean()
(TRUE counts as 1 and FALSE as 0), so instead of an error you get the proportion of rows where cigs is non-zero, not the average number of cigarettes.
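For what it's worth, here is a quick check on made-up numbers of what mean() does with a logical vector. The cigs vector below is hypothetical, not the real column.

```r
cigs <- c(0, 10, 0, 20, 5)     # made-up values

# Numeric values are coerced by |: zero becomes FALSE, non-zero becomes TRUE
is_smoker <- cigs | cigs > 0
is_smoker

# TRUE counts as 1 and FALSE as 0, so this is the share of smokers
share_smokers <- mean(is_smoker)

# The average number of cigarettes among smokers needs a subset instead
avg_among_smokers <- mean(cigs[cigs > 0])
```

So the | version runs, but it answers "what fraction of rows smoke?" rather than "how much do smokers smoke?".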
mean(bwght$cigs>0 | bwght$cigs=TRUE)
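Same problem. You use the |
sign, which returns a logical, so at best R would give you the mean of TRUE / FALSE values. On top of that, = is not a comparison in R (equality is ==), so I'd expect this line to fail with a syntax error before mean() even runs.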
if(bwght$cigs > 0){sum(bwght$cigs)}
By any chance, were you a SAS programmer originally? This looks like how I used to type at first. Basically, if()
doesn't work the same way in R as it does in SAS. In that example, you are using bwght$cigs > 0
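as the if condition, which won't work because if() expects a single TRUE or FALSE. Older versions of R only look at the first element of the vector resulting from bwght$cigs > 0 (and warn about it); since R 4.2.0, a condition longer than one element is an error. R handles looping differently from SAS: most operations are vectorized, so you usually subset the vector rather than test each row. Check out functions like lapply, tapply, and so on.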
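A rough sketch of the vectorized alternatives, using a made-up cigs vector rather than the real data:

```r
cigs <- c(0, 10, 0, 20, 5)     # made-up values

# Vectorized: subset first, then sum. No if() required.
total_smoked <- sum(cigs[cigs > 0])

# if() wants one TRUE/FALSE, so collapse the vector with any() or all()
if (any(cigs > 0)) {
  message("at least one smoker in the data")
}

# For an element-by-element decision, ifelse() is the vectorized form
status <- ifelse(cigs > 0, "smoker", "non-smoker")
```

Which one fits depends on whether you want one answer for the whole column (any(), all()) or one answer per row (ifelse()).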
x <-as.numeric(bwght$cigs, rm="0")
mean(x)
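As far as I can tell, this just returns cigs unchanged. as.numeric() has no rm argument, so rm="0"
is silently ignored (with or without the quotes), and mean(x) still includes the zeros. You may have been thinking of na.rm=TRUE in mean(), but that only drops NA values, not zeros.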
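A small check of that last attempt on made-up values (the cigs vector is hypothetical). My understanding is that as.numeric() quietly swallows extra arguments, so treat this as a sketch rather than a guarantee for every R version.

```r
cigs <- c(0, 10, 0, 20, 5)          # made-up values

# The rm argument is not something as.numeric() knows about
x <- as.numeric(cigs, rm = "0")
unchanged <- identical(x, cigs)     # nothing was removed
mean_with_zeros <- mean(x)          # zeros are still included

# na.rm = TRUE drops missing values only, never zeros
mean_na_dropped <- mean(c(cigs, NA), na.rm = TRUE)

# To leave the zeros out, subset explicitly
mean_no_zeros <- mean(cigs[cigs > 0])
```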