Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


numpy - Python - Statistical distribution

I'm quite new to the Python world, and I'm not a statistician. I need to implement mathematical models developed by mathematicians in a programming language, and I've chosen Python after some research. I'm comfortable with programming in general (PHP/HTML/JavaScript).

I have a column of values that I've extracted from a MySQL database, and I need to calculate the following:

1) The normal distribution of the values. (I don't have the sigma and mu values; apparently these need to be calculated too.)
2) A mixture of normal distributions
3) An estimate of the density of the normal distribution
4) The 'Z' score

The array of values looks similar to the one below (I've populated it with sample data):

d1 = [3,3,3,3,3,3,3,9,12,6,3,3,3,3,9,21,3,12,3,6,3,30,12,6,3,3,24,30,3,3,3]

As I understand it, the parameters of the normal distribution can be estimated as below:

import numpy as np
from scipy.stats import norm

# Estimate the mean (mu1) and standard deviation (std1) of the sample
mu1, std1 = norm.fit(d1)

Could I please get some pointers on how to get started with (2), (3) and (4)? I'm continuing to search online while I look forward to hearing from experts.

If the question doesn't fully make sense, please let me know which aspect is missing and I'll try to get more information about it.

I'd very much appreciate any help.


1 Reply


Some parts of your question are unclear. It might help to give the context of what you're trying to achieve, rather than just the specific steps you're taking.

1) + 3) For a normal distribution, fitting the distribution and estimating the mean and standard deviation are basically the same thing: the mean and standard deviation completely determine the distribution. The call

mu, std = norm.fit(data)

is tantamount to saying "find the mean and standard deviation that best fit the data".
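
As a rough sketch of that equivalence, using the sample data from the question: for a normal distribution, the values returned by `norm.fit` coincide with the sample mean and the standard deviation computed with `ddof=0`. The fitted density can then be evaluated with `norm.pdf`. The evaluation grid `xs` is an arbitrary choice made here for illustration.

```python
import numpy as np
from scipy.stats import norm

# Sample data from the question
d1 = [3,3,3,3,3,3,3,9,12,6,3,3,3,3,9,21,3,12,3,6,3,30,12,6,3,3,24,30,3,3,3]
data = np.asarray(d1, dtype=float)

# Maximum-likelihood estimates of the mean and standard deviation
mu, std = norm.fit(data)

# For a normal distribution these match the plain sample statistics
print(np.isclose(mu, data.mean()))
print(np.isclose(std, data.std(ddof=0)))

# Evaluate the fitted normal density on a grid (grid is an assumption)
xs = np.linspace(data.min(), data.max(), 50)
density = norm.pdf(xs, loc=mu, scale=std)
```

Whether a single normal distribution describes this kind of data well is a separate question; the code only shows what the fit computes.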

4) Calculating the Z score: you'll have to explain what you're trying to do. The Z score usually means how far above (or below) the mean a data point is, in units of standard deviation. Is this what you need here? If so, then it is simply

(np.array(data) - mu) / std
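
A minimal sketch of that calculation on the question's sample data, assuming the Z score is meant in this usual sense. It also compares the manual formula with `scipy.stats.zscore`, which uses `ddof=0` by default and so should agree with it.

```python
import numpy as np
from scipy.stats import zscore

# Sample data from the question
d1 = [3,3,3,3,3,3,3,9,12,6,3,3,3,3,9,21,3,12,3,6,3,30,12,6,3,3,24,30,3,3,3]
data = np.asarray(d1, dtype=float)

# Mean and standard deviation (ddof=0, matching norm.fit)
mu, std = data.mean(), data.std()

# Z score of every data point, computed by hand
z_manual = (data - mu) / std

# Same computation using SciPy's helper
z_scipy = zscore(data)

print(np.allclose(z_manual, z_scipy))
```

By construction the resulting scores have mean 0 and standard deviation 1.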

2) Mixture of normal distributions: this is completely unclear. It usually means that the data are actually generated by more than one normal distribution. What do you mean by this?
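
To illustrate that usual meaning, the density of a mixture is a weighted sum of normal densities. The sketch below builds a two-component mixture; the weights, means and standard deviations are hypothetical values picked for illustration, not estimates from the question's data, and `mixture_pdf` is a helper name introduced here.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical mixture parameters (assumptions, not fitted values)
weights = np.array([0.7, 0.3])   # must sum to 1
means = np.array([3.0, 20.0])
stds = np.array([1.0, 8.0])

def mixture_pdf(x):
    """Density of the mixture: sum over components of weight * normal pdf."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Rows correspond to points in x, columns to mixture components
    component_pdfs = norm.pdf(x[:, None], loc=means, scale=stds)
    return component_pdfs @ weights

# Sanity check: a density should integrate to roughly 1
xs = np.linspace(-40.0, 80.0, 4001)
total = np.sum(mixture_pdf(xs)) * (xs[1] - xs[0])
print(total)
```

Estimating such parameters from data is a different task from evaluating the density; it is typically done with an EM-based routine, for example `GaussianMixture` in scikit-learn.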

