Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Turning loop comprehensions into numpy form

Is there any way to compute the standard deviation in the same style as the y_mean and xy_mean computations below? I don't want to use a for loop or a function that uses a lot of RAM. I am trying to use np.convolve() to calculate the standard deviation std.

Variables:

import numpy as np

number = 5  # window length
PC_list = np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])

Vanilla Python versions (for the window starting at index i):

y_mean = sum(PC_list[i:i+number])/number
xy_mean = sum(x * (j + 1) for j, x in enumerate(PC_list[i:i+number]))/number
std = (sum((k - y_mean)**2 for k in PC_list[i:i+number])/(number-1))**0.5

Numpy versions:

y_mean = (np.convolve(PC_list, np.ones(shape=(number)), mode='valid')/number)[:-1]
xy_mean = (np.convolve(PC_list, np.arange(number, 0, -1), mode='valid')/number)[:-1]  # /number added to match the vanilla version
std = ?
Question from: https://stackoverflow.com/questions/65863226/turning-loop-comprehensions-into-numpy-form

1 Reply

You can use np.lib.stride_tricks.as_strided and np.std with ddof=1:

>>> np.std(
        np.lib.stride_tricks.as_strided(
            PC_list, 
            shape=(PC_list.shape[0] - number + 1, number), 
            strides=PC_list.strides*2
        ), 
        axis=-1, 
        ddof=1
    )
array([25.35313557, 11.6209317 , 16.32415133, 15.46019574, 15.29513506,
       14.02947067, 14.68620846, 17.04664993, 16.38348865, 12.9925946 ,
        9.58525968,  5.32623099, 10.61466493, 23.71209646, 27.85489139,
       23.31091745, 14.78211757, 12.11214834, 17.90301391, 15.42895731,
       11.7602241 ,  9.27171536, 12.57714149, 17.25865608, 15.2717403 ,
        9.02825105])
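
As a hedged alternative to calling as_strided directly: NumPy 1.20 and later ship np.lib.stride_tricks.sliding_window_view, which builds the same kind of windowed view but computes the shape and strides itself and returns a read-only view by default. The sketch below assumes NumPy >= 1.20 and uses only the first eight values of PC_list to stay short; the names windows and rolling_std are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

number = 5  # window length, as in the question

# First eight values of PC_list from the question (shortened for brevity).
PC_list = np.array([457.334015, 424.440002, 394.795990, 408.903992,
                    398.821014, 402.152008, 435.790985, 423.204987])

# One row per window; this is a read-only view, not a copy of the data.
windows = sliding_window_view(PC_list, number)

# Sample standard deviation (divisor number - 1) of every window.
rolling_std = windows.std(axis=-1, ddof=1)

print(windows.shape)    # (4, 5)
print(rolling_std[:2])  # should match the first two entries of the output above
```

With the full PC_list the call is unchanged; only the number of rows grows to PC_list.shape[0] - number + 1.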

Otherwise you can use pandas.Series.rolling.std, pandas.Series.dropna, then pandas.Series.to_numpy (with pandas imported as pd):

>>> pd.Series(PC_list).rolling(number).std().dropna().to_numpy()
 
array([25.35313557, 11.6209317 , 16.32415133, 15.46019574, 15.29513506,
       14.02947067, 14.68620846, 17.04664993, 16.38348865, 12.9925946 ,
        9.58525968,  5.32623099, 10.61466493, 23.71209646, 27.85489139,
       23.31091745, 14.78211757, 12.11214834, 17.90301391, 15.42895731,
       11.7602241 ,  9.27171536, 12.57714149, 17.25865608, 15.2717403 ,
        9.02825105])

EXPLANATION: np.lib.stride_tricks.as_strided returns a view of the array laid out as overlapping windows, which resembles a rolling operation, without copying the data:

>>> np.lib.stride_tricks.as_strided(
            PC_list, 
            shape=(PC_list.shape[0] - number + 1, number), 
            strides=PC_list.strides*2
        )

array([[457.334015, 424.440002, 394.79599 , 408.903992, 398.821014],   #index: 0,1,2,3,4
       [424.440002, 394.79599 , 408.903992, 398.821014, 402.152008],   #index: 1,2,3,4,5
       [394.79599 , 408.903992, 398.821014, 402.152008, 435.790985],   #index: 2,3,4,5,6
       [408.903992, 398.821014, 402.152008, 435.790985, 423.204987],   # ... and so on
       [398.821014, 402.152008, 435.790985, 423.204987, 411.574005],
       [402.152008, 435.790985, 423.204987, 411.574005, 404.424988],
       [435.790985, 423.204987, 411.574005, 404.424988, 399.519989],
       [423.204987, 411.574005, 404.424988, 399.519989, 377.181   ],
       [411.574005, 404.424988, 399.519989, 377.181   , 375.46701 ],
       [404.424988, 399.519989, 377.181   , 375.46701 , 386.944   ],
       [399.519989, 377.181   , 375.46701 , 386.944   , 383.61499 ],
       [377.181   , 375.46701 , 386.944   , 383.61499 , 375.071991],
       [375.46701 , 386.944   , 383.61499 , 375.071991, 359.511993],
       [386.944   , 383.61499 , 375.071991, 359.511993, 328.865997],
       [383.61499 , 375.071991, 359.511993, 328.865997, 320.51001 ],
       [375.071991, 359.511993, 328.865997, 320.51001 , 330.07901 ],
       [359.511993, 328.865997, 320.51001 , 330.07901 , 336.187012],
       [328.865997, 320.51001 , 330.07901 , 336.187012, 352.940002],
       [320.51001 , 330.07901 , 336.187012, 352.940002, 365.026001],
       [330.07901 , 336.187012, 352.940002, 365.026001, 361.562012],
       [336.187012, 352.940002, 365.026001, 361.562012, 362.299011],
       [352.940002, 365.026001, 361.562012, 362.299011, 378.549011],
       [365.026001, 361.562012, 362.299011, 378.549011, 390.414001],
       [361.562012, 362.299011, 378.549011, 390.414001, 400.869995],
       [362.299011, 378.549011, 390.414001, 400.869995, 394.77301 ],
       [378.549011, 390.414001, 400.869995, 394.77301 , 382.556   ]])
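
To make the strides argument less mysterious, here is a small sketch on a toy array (x and window are illustrative names, not from the original answer). Note that PC_list.strides*2 is tuple repetition, not multiplication by two: for a 1-D float64 array, (8,) * 2 is (8, 8), so stepping one row down and stepping one column right both advance by one element. That is what makes consecutive rows overlap.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(6, dtype=np.float64)  # toy data: 0.0 .. 5.0
window = 3

print(x.strides)      # (8,)   one float64 is 8 bytes
print(x.strides * 2)  # (8, 8) tuple repetition

# writeable=False guards against accidental writes through the view,
# which would touch several overlapping windows at once.
view = as_strided(
    x,
    shape=(x.shape[0] - window + 1, window),
    strides=x.strides * 2,
    writeable=False,
)

print(view.shape)                # (4, 3)
print(np.shares_memory(x, view)) # True: no data was copied

# A change to x shows up in every window that contains that element.
x[1] = 100.0
print(view[0, 1], view[1, 0])    # 100.0 100.0
```

Because as_strided performs no bounds checking, a wrong shape or strides can read memory outside the array, so it is worth double-checking both arguments.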

Taking the std of this windowed array across the last axis gives the rolling std. By default numpy uses ddof=0 (Delta Degrees of Freedom = 0), which means that for number samples the divisor is number - 0. Since you want a divisor of number - 1, you need ddof=1.
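
Since the question asked specifically about np.convolve, here is a hedged sketch of how the same result could be obtained with two convolutions. It relies on the identity sum((x - mean)**2) = sum(x**2) - (sum(x))**2 / n within each window. The function name rolling_std_convolve is hypothetical, and this is not the method from the answer above. The sum-of-squares form can lose precision through cancellation, so the sketch first subtracts the global mean (which does not change any window's variance) and clips tiny negative values before the square root.

```python
import numpy as np

def rolling_std_convolve(values, window, ddof=1):
    """Rolling std via np.convolve (hypothetical helper, illustrative only)."""
    values = np.asarray(values, dtype=np.float64)
    # Shifting the data leaves each window's variance unchanged but keeps
    # the squared terms small, which reduces cancellation error.
    shifted = values - values.mean()
    kernel = np.ones(window)
    win_sum = np.convolve(shifted, kernel, mode='valid')       # sum(x) per window
    win_sq_sum = np.convolve(shifted**2, kernel, mode='valid') # sum(x**2) per window
    # Sum of squared deviations, divided by (window - ddof).
    variance = (win_sq_sum - win_sum**2 / window) / (window - ddof)
    return np.sqrt(np.maximum(variance, 0.0))

# First eight values of PC_list from the question (shortened for brevity).
prices = np.array([457.334015, 424.440002, 394.795990, 408.903992,
                   398.821014, 402.152008, 435.790985, 423.204987])

sample_std = rolling_std_convolve(prices, 5, ddof=1)      # divisor number - 1
population_std = rolling_std_convolve(prices, 5, ddof=0)  # divisor number

print(sample_std[0])  # should match the first entry of the rolling std output above
```

The two results differ only by the constant factor sqrt((number - 1) / number), which is the whole effect of ddof.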

