QQ Plot and Shapiro-Wilk Test Disagree
$begingroup$
My QQ plot suggests that the data are not normally distributed:
qqplot(residual_values, fit=True, line='45')
pylab.show()
The sample skewness is about 0.55:
residual_values.skew()  # 0.5469389365591185
But the p-value of the Shapiro-Wilk test is greater than 0.05, which seems to tell me that the data are normally distributed:
shapiro(residual_values)  # (0.9569438099861145, 0.2261517345905304), i.e. (W statistic, p-value)
What is the correct inference here? Are the residuals normally distributed or not?
regression machine-learning
$endgroup$
– Shinigami
asked 9 hours ago
3
$begingroup$
The QQ plot looks consistent with being normally distributed. Did you expect every point to fall exactly on the line?
$endgroup$
– The Laconic
7 hours ago
3
$begingroup$
It is approximately normally distributed if you are prepared to discount slight skewness. No procedure ever indicates more.
$endgroup$
– Nick Cox
6 hours ago
$begingroup$
@TheLaconic Sorry I am new to ML.
$endgroup$
– Shinigami
6 hours ago
$begingroup$
@NickCox Thank you. Sorry, my reputation is too low to upvote you guys.
$endgroup$
– Shinigami
6 hours ago
$begingroup$
It's approximately normal. The skewness in the sample is quite mild, and sample skewness doesn't automatically mean the population is also skewed (though I expect it is). A high p-value on a test of normality doesn't mean that the population is normal, only that you couldn't detect whatever population non-normality there was. (The answer to "is it normally distributed" is "no": unless you generated it to be normal, it won't actually be normal. But why would it have to be?)
$endgroup$
– Glen_b♦
4 hours ago
4 Answers
$begingroup$
The Q-Q plot is consistent with approximate normality, more or less (which is not the same as proving it).
The Shapiro-Wilk test is a formal test of normality and, as such, it cannot confirm the null hypothesis of normality. The data may be reasonably consistent with normality yet still come from a nonnormal underlying distribution. Frequentist hypothesis tests, as a general rule, cannot prove a hypothesis, and failure to reject (p > alpha) does not support the null hypothesis.
@The Laconic gave some decent advice on interpreting the Q-Q plot. However, a large p-value does not lead you to accept the null hypothesis. You therefore cannot conclude normality from this test; the most you can say is that there is insufficient evidence of nonnormality at the a priori chosen alpha level.
$endgroup$
– LSC
answered 5 hours ago
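To make the "failure to reject is not confirmation" point concrete, here is a minimal simulation sketch. It assumes a sample size of 30 (the thread does not state the real one) and uses a hypothetical gamma population with shape 13.2, chosen only because its skewness, 2/sqrt(13.2), is roughly the 0.55 reported in the question. The helper names are made up for illustration.

```python
import numpy as np
from scipy import stats


def rejection_rate(sampler, n, n_sims=2000, alpha=0.05, seed=0):
    """Fraction of simulated samples for which Shapiro-Wilk rejects normality."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        sample = sampler(rng, n)
        p_value = stats.shapiro(sample)[1]  # result is (W statistic, p-value)
        rejections += int(p_value < alpha)
    return rejections / n_sims


def skewed(rng, n):
    # Assumed population: gamma(shape=13.2), which is genuinely non-normal.
    return rng.gamma(13.2, size=n)


def normal(rng, n):
    return rng.normal(size=n)


if __name__ == "__main__":
    # n = 30 is an assumption; substitute the real sample size if known.
    print("normal population, n=30:", rejection_rate(normal, 30))
    print("skewed population, n=30:", rejection_rate(skewed, 30))
```

If the second rate comes out well below 1, then many samples from a population that is known to be non-normal still yield p > 0.05, which is why a large p-value cannot be read as evidence of normality.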
$begingroup$
A Shapiro-Wilk p-value greater than 0.05 indicates a lack of evidence against normality. That is consistent with the QQ plot you showed, which is not too far off the line. I don't see what the inconsistency is here. Also, you should give a confidence interval (CI) for the skewness coefficient.
$endgroup$
– beta1_equals_beta2
answered 5 hours ago
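One way to get a CI for the skewness, sketched here under stated assumptions, is a percentile bootstrap. This is only one of several possible interval methods, not necessarily the one the answer had in mind. The `residuals` list is a hypothetical stand-in for `residual_values`, and `sample_skewness` uses the plain moment formula, so it may differ slightly from what `residual_values.skew()` reports if that method applies a small-sample correction.

```python
import random


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # degenerate resample with no spread
    return m3 / m2 ** 1.5


def bootstrap_skewness_ci(values, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the skewness."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and recompute the skewness each time.
    boot = sorted(sample_skewness(rng.choices(values, k=n)) for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return boot[lo_idx], boot[hi_idx]


if __name__ == "__main__":
    data_rng = random.Random(42)
    # Hypothetical data: 30 draws from a mildly skewed gamma distribution.
    residuals = [data_rng.gammavariate(13.2, 1.0) for _ in range(30)]
    print("skewness:", sample_skewness(residuals))
    print("95% CI:", bootstrap_skewness_ci(residuals))
```

With a small sample the interval is typically wide, and if it includes 0 the observed skewness is not clearly distinguishable from that of a symmetric population.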
$begingroup$
The QQ plot is an informal test of normality that can give you some insight into the nature of deviations from normality: for example, whether the distribution has some skew or fat tails, or whether there are specific observations that deviate from what you would expect from a normal distribution (outliers). The QQ plot can often convince you that the distribution is definitely not normal, but this isn't such a case. Here, the points fall more or less along the line, which is broadly consistent with normality. Intuitively, this is the sort of variation you would expect to see in a small sample.
The Shapiro test is a formal test of normality. I'm not familiar with the shapiro function's output, so I'm not sure which number, if either, is supposed to be the p-value, but if you say it's largish, then we are led to accept the null hypothesis of normality. And this is consistent with what we see qualitatively in the QQ plot.
$endgroup$
– The Laconic
answered 5 hours ago
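For readers who want to see what "points along the line" means numerically, the sketch below computes normal Q-Q coordinates by hand using only the standard library. It standardises the sample first, which is my understanding of what `fit=True` with a 45-degree reference line amounts to. The plotting positions `(i + 0.5) / n` are one common convention and an assumption here, and the function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, pstdev


def qq_points(values):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    n = len(values)
    mu, sigma = mean(values), pstdev(values)
    observed = sorted((x - mu) / sigma for x in values)  # standardised sample
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n stay strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))


def qq_correlation(values):
    """Correlation of the Q-Q points; values near 1 mean a nearly straight plot."""
    pairs = qq_points(values)
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(0.0, 1.0) for _ in range(30)]  # hypothetical small sample
    print("Q-Q correlation:", qq_correlation(sample))
```

Even for data that really are normal, a small sample gives points that wobble around the line, so the correlation is close to but not exactly 1.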
$begingroup$
My understanding is that, given power issues with normality tests, they are not highly recommended. As a result I don't use them any more, preferring QQ plots (which are recommended in the literature I have seen).
$endgroup$
– user54285
answered 5 hours ago
$begingroup$
I was under the impression that formal tests of normality are usually too powerful and too frequently detect immaterial departures from normality. Visualization is generally preferred, as you said (and theoretical knowledge when available).
$endgroup$
– LSC
2 hours ago
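The answer and the comment above describe two sides of the same sample-size dependence, and a rough simulation can show both. The sketch below keeps one mildly skewed population fixed and varies only n. The gamma shape of 13.2 (skewness about 0.55) and the sample sizes are assumptions chosen for illustration, not values from the thread.

```python
import numpy as np
from scipy import stats


def shapiro_rejection_rate(n, shape=13.2, n_sims=200, alpha=0.05, seed=1):
    """Share of gamma(shape) samples of size n where Shapiro-Wilk gives p < alpha."""
    rng = np.random.default_rng(seed)
    # Index [1] picks the p-value out of the (W statistic, p-value) result.
    p_values = [stats.shapiro(rng.gamma(shape, size=n))[1] for _ in range(n_sims)]
    return float(np.mean(np.array(p_values) < alpha))


if __name__ == "__main__":
    # Same population every time; only the sample size changes.
    for n in (30, 300, 3000):
        print(n, shapiro_rejection_rate(n))
```

The same departure from normality tends to go undetected in small samples and to be flagged almost always in large ones, so the p-value says more about n than about whether the departure matters for the analysis.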