Testing accuracy is less than half of training accuracy, and manual testing results are way off

First, I have to admit that I am a total beginner at PyTorch and CNN image classification.



I am making an app to classify cat breeds.



The image sets I gathered have around 300-500 images per breed across a total of 62 breeds, plus one additional set of 600 samples representing non-cats. I have split the samples into training and testing sets at a 4:1 ratio, roughly as in the sketch below.
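
For context, here is a minimal sketch of one way to build that kind of split from a folder-based dataset, with a validation slice added; the data/cats path and the 70/10/20 proportions are illustrative assumptions, not taken from the linked repo:

import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Hypothetical layout: one folder per breed (plus a non-cat folder) under data/cats.
full_dataset = datasets.ImageFolder(
    "data/cats",
    transform=transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ]),
)

# 4:1 train/test as in the question, with a validation slice
# carved out of the training portion (roughly 70/10/20 overall).
n_total = len(full_dataset)
n_test = n_total // 5
n_val = n_total // 10
n_train = n_total - n_test - n_val

train_set, val_set, test_set = random_split(
    full_dataset,
    [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(42),  # reproducible split
)

train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = DataLoader(val_set, batch_size=128)
test_loader = DataLoader(test_set, batch_size=128)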



The training results are quite disappointing: training accuracy reaches as much as 90%, but testing accuracy is only 39%.



Here are the hyperparameters:



The learning rate is 0.1, momentum is 0.1, the batch size is 128, and the WideResNet uses 40 layers with a widen factor of 10.
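
In PyTorch terms those settings would be wired up roughly as follows; the stand-in resnet18 model and the milestone schedule are assumptions for illustration only (the repo uses a WideResNet-40-10, and no particular schedule is claimed here):

import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Stand-in model for a runnable sketch; the repo uses a WideResNet-40-10 instead.
model = models.resnet18(num_classes=63)  # 62 breeds + 1 non-cat class

criterion = nn.CrossEntropyLoss()

# The hyperparameters from the question, passed to SGD.
# (momentum=0.1 as stated; 0.9 is the more common choice for SGD.)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.1)

# With a starting learning rate of 0.1, a decay schedule is typically
# needed to converge; these milestones are an assumption, not from the repo.
scheduler = optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.2
)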



Please see the source code at:



https://github.com/silver-xu/wideresnet-trial



I have searched all over the internet, and roughly 90% of the articles are about pre-packaged datasets like CIFAR or MNIST. As a result, most of the code I found is optimised for only one type of dataset.



Thanks for all the help! Criticism is also welcome!



Here is the training output for epoch 5:



Epoch: [5][0/170] Time 0.237 (0.237) Loss 3.3054 (3.3054) Prec@1 13.281 (13.281)
Epoch: [5][10/170] Time 0.229 (0.228) Loss 3.2665 (3.3118) Prec@1 14.844 (13.920)
Epoch: [5][20/170] Time 0.227 (0.227) Loss 3.0962 (3.2856) Prec@1 17.969 (14.695)
Epoch: [5][30/170] Time 0.228 (0.227) Loss 3.3670 (3.2853) Prec@1 10.938 (14.844)
Epoch: [5][40/170] Time 0.229 (0.227) Loss 3.3259 (3.2917) Prec@1 15.625 (15.282)
Epoch: [5][50/170] Time 0.228 (0.227) Loss 3.2016 (3.2931) Prec@1 14.844 (14.859)
Epoch: [5][60/170] Time 0.227 (0.227) Loss 3.3739 (3.3071) Prec@1 11.719 (14.677)
Epoch: [5][70/170] Time 0.227 (0.227) Loss 3.4417 (3.3042) Prec@1 15.625 (14.833)
Epoch: [5][80/170] Time 0.226 (0.227) Loss 3.2507 (3.2996) Prec@1 10.938 (14.911)
Epoch: [5][90/170] Time 0.224 (0.227) Loss 3.2627 (3.2978) Prec@1 14.844 (15.093)
Epoch: [5][100/170] Time 0.226 (0.227) Loss 3.3668 (3.2946) Prec@1 14.062 (15.060)
Epoch: [5][110/170] Time 0.225 (0.227) Loss 3.2839 (3.2915) Prec@1 10.156 (14.921)
Epoch: [5][120/170] Time 0.227 (0.227) Loss 3.3308 (3.2906) Prec@1 11.719 (14.837)
Epoch: [5][130/170] Time 0.224 (0.227) Loss 3.1656 (3.2885) Prec@1 21.875 (14.909)
Epoch: [5][140/170] Time 0.226 (0.227) Loss 3.2521 (3.2851) Prec@1 20.312 (14.966)
Epoch: [5][150/170] Time 0.227 (0.227) Loss 3.1261 (3.2825) Prec@1 14.844 (14.989)
Epoch: [5][160/170] Time 0.227 (0.227) Loss 3.4400 (3.2802) Prec@1 10.938 (15.018)
Test: [0/43] Time 0.262 (0.262) Loss 3.6978 (3.6978) Prec@1 8.594 (8.594)
Test: [10/43] Time 0.074 (0.091) Loss 3.3584 (3.3736) Prec@1 17.188 (13.139)
Test: [20/43] Time 0.074 (0.083) Loss 3.3834 (3.4058) Prec@1 12.500 (12.537)
Test: [30/43] Time 0.074 (0.080) Loss 3.4457 (3.3994) Prec@1 14.844 (12.802)
Test: [40/43] Time 0.074 (0.079) Loss 3.2851 (3.3946) Prec@1 16.406 (13.281)
* Prec@1 13.130

python pytorch

edited Nov 19 at 0:56
asked Nov 18 at 23:24

Silver Xu (62)

  • Have you trained with a validation set? How does the validation loss compare to the training loss?
    – Charles Landau
    Nov 19 at 0:04

  • Great! It sounds like you may be overfitting; I would look for early divergence in the validation loss (e.g. in epochs < 5) as another indicator of overfitting.
    – Charles Landau
    Nov 19 at 0:21

  • Here you go. Your help is greatly appreciated.
    – Silver Xu
    Nov 19 at 0:33

  • You don't appear to be training with a validation set; see this discussion for some basic approaches to train/validation/test patterns in PyTorch: github.com/pytorch/pytorch/issues/1106
    – Charles Landau
    Nov 19 at 0:57

  • @Charles Landau I've done what you suggested above. The result improved to around 60%. Is there anything I can improve further?
    – Silver Xu
    Nov 19 at 10:52
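
Following the comments, here is a minimal sketch of the per-epoch validation check being suggested; the evaluate helper, the patience value, and the checkpoint path are illustrative assumptions, and model, criterion, and val_loader continue the placeholder names from the sketches above:

import torch

def evaluate(model, loader, criterion, device="cpu"):
    """Return mean loss and top-1 accuracy over a data loader."""
    model.eval()
    total_loss, correct, count = 0.0, 0, 0
    with torch.no_grad():  # no gradients needed for evaluation
        for images, targets in loader:
            images, targets = images.to(device), targets.to(device)
            outputs = model(images)
            total_loss += criterion(outputs, targets).item() * targets.size(0)
            correct += (outputs.argmax(dim=1) == targets).sum().item()
            count += targets.size(0)
    return total_loss / count, correct / count

# Track validation loss each epoch; sustained divergence from the
# training loss is the overfitting signal mentioned in the comments.
best_val_loss, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(200):
    # ... one epoch of training on train_loader goes here ...
    val_loss, val_acc = evaluate(model, val_loader, criterion)
    if val_loss < best_val_loss:
        best_val_loss, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), "best.pt")  # keep the best checkpoint
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # simple early stopping
            break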