I have a matlab dataset that looks like this:
year value
1995 90000
1995 53000
1995 80000
1995 60000
1995 37000
1995 42000
1995 13102
1996 35000
1996 50000
1996 32000
1996 47000
1997 36000
1997 90000
1997 NaN
1997 90000
1997 51500
1997 81000
1998 71000
(...)
2020 68000
These are two separate columns of data.
Now I want to count the number of non-NaN observations in column value
between 2010 and 2020 per year i.e. the output should look like:
year count
2010 20
2011 31
(...)
2020 9
If any count is zero, it should show up as zero.
I know I can do it with a very simple loop (example below). But this is very inefficient for a large dataset. I was looking into accumarray, but could not figure out how to do it.
N = 300;
%Generate years vector
years = round(1996 + (2020-1996) .* (rand(N,1)));
years = sort(years);
% Generate values vector
values = rand(N,1);
NaN_position = rand(N,1)>.9; %Now put some random NaNs
values(NaN_position) = NaN;
count = 1;
for y=min(years):max(years)
indicator = years == y;
count_vals(count,1) = sum(not(isnan(values(indicator))));
count = count + 1;
end
question from:
https://stackoverflow.com/questions/65833873/count-number-of-observations-by-group-matlab 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…