Testing accuracy is less than half of training accuracy, and manual testing results are way off
First, I have to admit that I am a total beginner at PyTorch and CNN image classification.
I am making an app to classify cat breeds.
The image sets I gathered have around 300-500 images per breed across 62 breeds, plus one additional set of 600 samples representing non-cats. I have split the samples into training and testing sets at a 4:1 ratio.
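Roughly, the data loading and split look like the sketch below (the folder path and transform are placeholders; the exact code is in the repo linked further down):

    import torch
    from torchvision import datasets, transforms

    # Placeholder preprocessing; the real pipeline is in the linked repo.
    transform = transforms.Compose([
        transforms.Resize((32, 32)),
        transforms.ToTensor(),
    ])

    # One folder per class: 62 breeds plus one non-cat folder.
    full_set = datasets.ImageFolder("data/cats", transform=transform)

    # 4:1 train/test split.
    n_train = int(0.8 * len(full_set))
    train_set, test_set = torch.utils.data.random_split(
        full_set, [n_train, len(full_set) - n_train]
    )

    train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
    test_loader = torch.utils.data.DataLoader(test_set, batch_size=128)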
The results are quite disappointing: training accuracy can reach as much as 90%, but testing accuracy is only 39%.
Here are the hyperparameters:
LR is 0.1, momentum is 0.1, batch_size is 128, and the WideResNet has 40 layers with a widen factor of 10.
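In code, the setup is roughly equivalent to this sketch (WideResNet here stands in for the model class defined in the repo, not an import from a library; 63 = 62 breeds + 1 non-cat class):

    import torch.nn as nn
    import torch.optim as optim

    # WideResNet is a stand-in for the model class in the linked repo.
    model = WideResNet(depth=40, num_classes=63, widen_factor=10)

    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.1)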
Please see the source code at:
https://github.com/silver-xu/wideresnet-trial
I have searched all over the internet, and roughly 90% of the articles are about pre-packaged datasets like CIFAR or MNIST. As a result, much of the code I found is optimised for only one type of dataset.
Thanks for any help! Criticism is also welcome!
Here is training output for Epoch 5:
Epoch: [5][0/170] Time 0.237 (0.237) Loss 3.3054 (3.3054) Prec@1 13.281 (13.281)
Epoch: [5][10/170] Time 0.229 (0.228) Loss 3.2665 (3.3118) Prec@1 14.844 (13.920)
Epoch: [5][20/170] Time 0.227 (0.227) Loss 3.0962 (3.2856) Prec@1 17.969 (14.695)
Epoch: [5][30/170] Time 0.228 (0.227) Loss 3.3670 (3.2853) Prec@1 10.938 (14.844)
Epoch: [5][40/170] Time 0.229 (0.227) Loss 3.3259 (3.2917) Prec@1 15.625 (15.282)
Epoch: [5][50/170] Time 0.228 (0.227) Loss 3.2016 (3.2931) Prec@1 14.844 (14.859)
Epoch: [5][60/170] Time 0.227 (0.227) Loss 3.3739 (3.3071) Prec@1 11.719 (14.677)
Epoch: [5][70/170] Time 0.227 (0.227) Loss 3.4417 (3.3042) Prec@1 15.625 (14.833)
Epoch: [5][80/170] Time 0.226 (0.227) Loss 3.2507 (3.2996) Prec@1 10.938 (14.911)
Epoch: [5][90/170] Time 0.224 (0.227) Loss 3.2627 (3.2978) Prec@1 14.844 (15.093)
Epoch: [5][100/170] Time 0.226 (0.227) Loss 3.3668 (3.2946) Prec@1 14.062 (15.060)
Epoch: [5][110/170] Time 0.225 (0.227) Loss 3.2839 (3.2915) Prec@1 10.156 (14.921)
Epoch: [5][120/170] Time 0.227 (0.227) Loss 3.3308 (3.2906) Prec@1 11.719 (14.837)
Epoch: [5][130/170] Time 0.224 (0.227) Loss 3.1656 (3.2885) Prec@1 21.875 (14.909)
Epoch: [5][140/170] Time 0.226 (0.227) Loss 3.2521 (3.2851) Prec@1 20.312 (14.966)
Epoch: [5][150/170] Time 0.227 (0.227) Loss 3.1261 (3.2825) Prec@1 14.844 (14.989)
Epoch: [5][160/170] Time 0.227 (0.227) Loss 3.4400 (3.2802) Prec@1 10.938 (15.018)
Test: [0/43] Time 0.262 (0.262) Loss 3.6978 (3.6978) Prec@1 8.594 (8.594)
Test: [10/43] Time 0.074 (0.091) Loss 3.3584 (3.3736) Prec@1 17.188 (13.139)
Test: [20/43] Time 0.074 (0.083) Loss 3.3834 (3.4058) Prec@1 12.500 (12.537)
Test: [30/43] Time 0.074 (0.080) Loss 3.4457 (3.3994) Prec@1 14.844 (12.802)
Test: [40/43] Time 0.074 (0.079) Loss 3.2851 (3.3946) Prec@1 16.406 (13.281)
* Prec@1 13.130
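Prec@1 is the top-1 accuracy for the current batch, with the running average in parentheses. It is computed along these lines (a minimal sketch, not the exact helper from my code):

    import torch

    def top1_accuracy(output, target):
        # Top-1 accuracy in percent for one batch.
        with torch.no_grad():
            pred = output.argmax(dim=1)              # predicted class per sample
            correct = (pred == target).sum().item()  # number of hits
            return 100.0 * correct / target.size(0)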
python pytorch
asked Nov 18 at 23:24 by Silver Xu (62) · edited Nov 19 at 0:56
Have you trained with a validation set? How does the validation loss compare to the training loss?
– Charles Landau
Nov 19 at 0:04
Great! It sounds like you may be overfitting; I would look for early divergence in the validation loss (e.g. before epoch 5) as another indicator of overfitting.
– Charles Landau
Nov 19 at 0:21
Here you go. Your help is greatly appreciated.
– Silver Xu
Nov 19 at 0:33
You don't appear to be training with a validation set; see this discussion for some basic approaches to train/validation/test patterns in PyTorch: github.com/pytorch/pytorch/issues/1106
– Charles Landau
Nov 19 at 0:57
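A minimal sketch of the train/validation pattern that discussion suggests, reusing train_set from the split sketch above (the 10% validation fraction is an arbitrary illustrative choice):

    import numpy as np
    from torch.utils.data import DataLoader, SubsetRandomSampler

    # Carve a validation subset out of train_set (defined in the earlier sketch).
    indices = np.random.permutation(len(train_set)).tolist()
    val_size = int(0.1 * len(train_set))
    val_idx, train_idx = indices[:val_size], indices[val_size:]

    train_loader = DataLoader(train_set, batch_size=128,
                              sampler=SubsetRandomSampler(train_idx))
    val_loader = DataLoader(train_set, batch_size=128,
                            sampler=SubsetRandomSampler(val_idx))

    # Evaluate on val_loader after each epoch: validation loss rising while
    # training loss keeps falling is the overfitting signal mentioned above.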
@Charles Landau I've done what you suggested above. The result improved to around 60%. Is there anything I can improve further?
– Silver Xu
Nov 19 at 10:52