More output neurons than labels?


When we train a neural network model for a classification problem, we usually have a dense output layer of size equal to the number of labels we have.



If the output layer were larger than the number of labels, the model could still be trained and would still produce output.



How can we interpret such output? Are there any applications for this?










  • Are you looking for a solution for a target class that has not yet been introduced but will probably appear in the future? – vipin bansal, Jul 16 at 13:38
  • @vipinbansal Not really. – Abdulrahman Bres, Jul 16 at 13:58

















neural-network deep-learning classification






asked Jul 16 at 12:51









Abdulrahman Bres

2 Answers






The output layer is usually the same size as the last dense layer because we apply a loss function to train the model by comparing the last layer to what the output should be. If your output layer was bigger, it's less intuitive what your loss function should be, but the interpretation of your output would likely come from how you define this loss.






– Andy M, answered Jul 16 at 13:32
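The point that the loss defines the interpretation can be made concrete. Below is a hypothetical pure-Python sketch (stdlib only, not Keras) of a cross-entropy-style loss that compares only the first `num_labels` output neurons against the label and simply ignores any extra neurons; under such a loss the extra outputs receive no training signal and therefore have no forced interpretation. All names here are illustrative, not from the thread.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def partial_cross_entropy(outputs, label, num_labels):
    """Cross-entropy computed only over the first `num_labels` neurons.

    `outputs` may be longer than `num_labels`; the extra neurons are
    sliced away before the loss is computed, so training never
    constrains them.
    """
    probs = softmax(outputs[:num_labels])
    return -math.log(probs[label])

# A network with 5 output neurons but only 3 labels: the last two
# neurons do not affect the loss at all.
outputs = [2.0, 0.5, -1.0, 7.3, 7.3]
loss = partial_cross_entropy(outputs, label=0, num_labels=3)
```

Because the extra neurons contribute zero gradient under this loss, whatever they end up encoding would have to come from some other objective (e.g. a regularizer) added alongside it.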

    The interpretation of the output depends not only on the architecture of the network, but also on the final-layer activation functions and the training procedure. Most importantly, training a neural net requires you to choose a loss function, which describes how far off the predictions in the final layer are from ground truth. If you can specify a sensible loss function, then you've implicitly defined how the output layer is to be interpreted.



    Offhand, I can't think of a classification problem where it would be helpful to have more output neurons than labels (but maybe someone else is more creative!)



    One somewhat similar case is the advantage actor-critic network commonly used in reinforcement learning. The ultimate goal of a reinforcement-learning agent is to choose an action from a set of $n$ possible actions, so traditionally we might build a network with $n$ outputs. Actor-critic methods actually have $n+1$ outputs: the first $n$ choose an action, and the "extra" neuron tries to estimate the value of the chosen action.






– zachdj, answered Jul 16 at 13:37
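The $n+1$-output split described above can be sketched in a few lines of plain Python (names are illustrative; a real actor-critic implementation would sit on top of an RL library): the first $n$ entries of the head are treated as action logits that a softmax turns into a policy, and the last entry is read off directly as the value estimate.

```python
import math

def split_actor_critic_head(outputs):
    """Split an (n+1)-element output vector as an actor-critic head would:
    the first n entries are action logits, the last is the value estimate.
    """
    logits, value = outputs[:-1], outputs[-1]
    # Softmax over the action logits gives the policy distribution.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    policy = [e / total for e in exps]
    return policy, value

# Hypothetical head output for n = 4 actions plus one value neuron:
head = [0.1, 1.2, -0.3, 0.4, 2.5]
policy, value = split_actor_critic_head(head)
# `policy` holds 4 probabilities summing to 1; `value` is 2.5
```

The point is that the extra neuron is meaningful only because the training objective (policy loss for the first $n$ outputs, value loss for the last) assigns it a meaning, which echoes the loss-defines-interpretation argument above.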







