
Mathematical formulation of Support Vector Machines?


I'm trying to learn the maths behind the (hard-margin) SVM, but the different forms of the mathematical formulation have left me a bit confused.


  • Assume we have two sets of points (i.e. positives and negatives), one on each side of a hyperplane $\pi$.

  • The equation of the margin-maximizing hyperplane $\pi$ can be written as
    $$\pi:\; w^T x + b = 0$$

    • If $y \in \{+1, -1\}$, then

  $$\pi^+:\; w^T x + b = +1$$
  $$\pi^-:\; w^T x + b = -1$$

  • Here $\pi^+$ and $\pi^-$ are parallel to the plane $\pi$, and hence also parallel to each other. The objective is to find the hyperplane $\pi$ that maximizes the distance between $\pi^+$ and $\pi^-$.

Here $\pi^+$ and $\pi^-$ are the hyperplanes passing through the positive and negative support vectors respectively.

  • According to the Wikipedia article on SVMs, the distance/margin between $\pi^+$ and $\pi^-$ can be written as
    $$\frac{2}{\|w\|}$$

  • Putting everything together, this is the constrained optimization problem we want to solve:
    $$\text{find } w_*, b_* = \underbrace{\operatorname*{argmax}_{w,b} \frac{2}{\|w\|}}_{\text{margin}}$$
    $$\text{s.t.}\;\; y_i(w^T x_i + b) \ge 1 \;\;\forall\, i$$

Before proceeding to my doubts, please confirm whether my understanding above is correct, and correct me if you find any mistakes.


  • How is the margin between $\pi^+$ and $\pi^-$ derived to be $\frac{2}{\|w\|}$? I did find a similar question asked here, but I couldn't understand the formulation used there. If possible, can anyone explain it in the formulation I used above?

  • Why can we require $y_i(w^T x_i + b) \ge 1 \;\;\forall\, i$?

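A quick numeric sanity check of the formulation above (a sketch, not part of the original question; the weight vector $w$ and offset $b$ are arbitrary assumed example values):

```python
import math

# Assumed example values for the weight vector and bias (any non-zero w works).
w = [3.0, 4.0]          # so that ||w|| = 5
b = -2.0

norm_w = math.sqrt(sum(wi * wi for wi in w))

# A point on pi+ : choose it along the direction of w so that w.x + b = +1,
# i.e. x = t * w with t = (1 - b) / (w.w).
t_plus = (1.0 - b) / (norm_w ** 2)
z_plus = [t_plus * wi for wi in w]

# A point on pi- : w.x + b = -1, along the same direction.
t_minus = (-1.0 - b) / (norm_w ** 2)
z_minus = [t_minus * wi for wi in w]

# Both points lie on the line through the origin along w, which is
# perpendicular to both hyperplanes, so their distance is the margin.
dist = math.sqrt(sum((p - q) ** 2 for p, q in zip(z_plus, z_minus)))

print(dist)             # 0.4
print(2.0 / norm_w)     # 0.4, i.e. the margin 2/||w||
```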








      machine-learning svm optimization linear-algebra






edited Jul 12 at 22:56 by user_6396

asked Jul 12 at 21:47 by user_6396




















1 Answer


















Your understanding is correct.




Deriving the margin to be $\frac{2}{\|w\|}$:

We know that $w \cdot x + b = 1$ on $\pi^+$.

If we move from a point $z$ on $w \cdot x + b = 1$ to the line $w \cdot x + b = 0$, we land at a point $\lambda$. The distance we pass between the two lines $w \cdot x + b = 1$ and $w \cdot x + b = 0$ is the margin between them, which we call $\gamma$.

To calculate this margin, note that we have moved from $z$ to the point $\lambda$ in the direction opposite to $w$. Hence $\lambda = z - \gamma \cdot \frac{w}{\|w\|}$ (we only want the direction of the move, so we normalize $w$ to the unit vector $\frac{w}{\|w\|}$).

Since the point $\lambda$ lies on the decision boundary, it must satisfy $w \cdot x + b = 0$. Substituting it in place of $x$:

$$w \cdot x + b = 0$$
$$w \cdot \left(z - \gamma \cdot \frac{w}{\|w\|}\right) + b = 0$$
$$w \cdot z + b - \gamma \cdot \frac{w \cdot w}{\|w\|} = 0$$
$$w \cdot z + b = \gamma \cdot \frac{w \cdot w}{\|w\|}$$
We know that $w \cdot z + b = 1$ ($z$ is a point on $w \cdot x + b = 1$), so
$$1 = \gamma \cdot \frac{w \cdot w}{\|w\|}$$
$$\gamma = \frac{\|w\|}{w \cdot w}$$
We also know that $w \cdot w = \|w\|^2$, hence:
$$\gamma = \frac{1}{\|w\|}$$
Why does your formula have a 2 where mine has a 1? Because I have calculated the margin between the middle line and the upper one, which is half of the whole margin.


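The derivation can be checked numerically; a minimal sketch, where $w$, $b$, and the starting point $z$ are assumed example values:

```python
import math

w = [3.0, 4.0]
b = 1.0
norm_w = math.sqrt(sum(wi * wi for wi in w))   # 5.0
gamma = 1.0 / norm_w                           # the derived half-margin

# A point z on w.x + b = 1 (here simply the origin, since b = 1).
z = [0.0, 0.0]
assert abs(sum(wi * zi for wi, zi in zip(w, z)) + b - 1.0) < 1e-12

# Step from z against the direction of w by gamma: lam = z - gamma * w/||w||.
lam = [zi - gamma * wi / norm_w for zi, wi in zip(z, w)]

# lam should land on the decision boundary w.x + b = 0.
print(sum(wi * li for wi, li in zip(w, lam)) + b)   # ~0.0
```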


• Why can we require $y_i(w^T x_i + b) \ge 1 \;\;\forall\, i$?

We want to classify the points on the $+1$ side as $+1$ and the points on the $-1$ side as $-1$. Since $w^T x_i + b$ is the predicted value and $y_i$ is the actual label of each point, a correctly classified point has prediction and label with the same sign, so their product $y_i(w^T x_i + b)$ is positive. The condition $\ge 0$ is replaced by $\ge 1$ because it is a stronger condition: the scale of $w$ and $b$ can always be chosen so that the support vectors satisfy $|w^T x_i + b| = 1$, and every other point then satisfies the inequality as well.

(The transpose is only there to make the dot product well defined; elsewhere I wrote $w \cdot x$ to show the logic of the dot product and omitted the transpose.)


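The constraint can be checked on a small linearly separable toy set (the points and the canonical solution $w = (1, 0)$, $b = 0$ are assumed for illustration; the boundary $x_1 = 0$ is the max-margin separator here by symmetry):

```python
# Toy linearly separable data: label +1 if x[0] >= 1, -1 if x[0] <= -1.
X = [(1.0, 0.0), (2.0, 1.0), (3.0, -1.0), (-1.0, 0.0), (-2.0, 2.0)]
y = [1, 1, 1, -1, -1]

# Canonical max-margin solution for this data (assumed): the boundary is
# x[0] = 0, scaled so the closest points give w.x + b = +/-1.
w = (1.0, 0.0)
b = 0.0

margins = [yi * (w[0] * xi[0] + w[1] * xi[1] + b) for xi, yi in zip(X, y)]
print(margins)                          # [1.0, 2.0, 3.0, 1.0, 2.0]
assert all(m >= 1.0 for m in margins)   # the hard-margin constraint holds
assert min(margins) == 1.0              # support vectors sit exactly on pi+/pi-
```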


To calculate the total distance between the lines $w \cdot x + b = -1$ and $w \cdot x + b = 1$:

Either multiply the calculated half-margin $\gamma$ by 2, or find it directly: consider a point $\alpha$ on the line $w \cdot x + b = -1$. The distance between the two lines is twice the value of $\gamma$, so moving from the point $z$ to $\alpha$ gives
$$\alpha = z - 2 \cdot \gamma \cdot \frac{w}{\|w\|}$$
and the total margin (the length passed) is $2\gamma = \frac{2}{\|w\|}$.

Derived from the ML course at UCSD by Prof. Sanjoy Dasgupta.


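The direct construction can also be verified numerically (a sketch; $w$, $b$, and $z$ are the same kind of assumed example values as above):

```python
import math

w = [3.0, 4.0]
b = 1.0
norm_w = math.sqrt(sum(wi * wi for wi in w))   # 5.0
gamma = 1.0 / norm_w

# A point z on w.x + b = +1 (the origin works, since b = 1).
z = [0.0, 0.0]

# Move from z by twice gamma against w: alpha = z - 2 * gamma * w/||w||.
alpha = [zi - 2.0 * gamma * wi / norm_w for zi, wi in zip(z, w)]

# alpha lands on w.x + b = -1, and the passed length is the full margin.
print(sum(wi * ai for wi, ai in zip(w, alpha)) + b)   # ~-1.0
length = math.sqrt(sum((zi - ai) ** 2 for zi, ai in zip(z, alpha)))
print(length, 2.0 / norm_w)    # both 0.4
```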




Comments:

  • user_6396 (Jul 12 at 23:50): "we know that we have moved from z, in direction of w to point $\lambda$": here $z$ lies on $w^T x + b = +1$, and we also know that the normal $w$ to the plane $\pi$ points in the direction of the positive points, right? So aren't we moving from $z$ to the point $\lambda$ in the direction opposite to $w$?

  • user_6396 (Jul 13 at 0:15): How can I calculate the whole margin so that the formula becomes $\frac{2}{\|w\|}$? It will give me an intuition.

  • Fatemeh Asgarinejad (Jul 13 at 0:58): I added it to the bottom of my post.

  • user_6396 (Jul 13 at 1:03): I really appreciate all the help. Thanks :)

  • Fatemeh Asgarinejad (Jul 13 at 1:03): I'm happy that I could help. :)













          Your Answer








          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "557"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













          draft saved

          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f55592%2fmathematical-formulation-of-support-vector-machines%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          6












          $begingroup$

          Your understandings are right.




          deriving the margin to be $frac2$




          we know that $w cdot x +b = 1$



          If we move from point z in $w cdot x +b = 1$ to the $w cdot x +b = 0$ we land in a point $lambda$. This line that we have passed or this margin between the two lines $w cdot x +b = 1$ and $w cdot x +b = 0$ is the margin between them which we call $gamma$



          For calculating the margin, we know that we have moved from z, in opposite direction of w to point $lambda$. Hence this margin $gamma$ would be equal to $z - margin cdot fracw = z - gamma cdot fracw =$ (we have moved in the opposite direction of w, we just want the direction so we normalize w to be a unit vector $fracw$)



          Since this $lambda$ point lies in the decision boundary we know that it should suit in line $w cdot x + b = 0$
          Hence we set is in this line in place of x:



          $$w cdot x + b = 0$$
          $$w cdot (z - gamma cdot fracw) + b = 0$$
          $$w cdot z + b - w cdot gamma cdot fracw) = 0$$
          $$w cdot z + b = w cdot gamma cdot fracw$$
          we know that $w cdot z +b = 1$ (z is the point on $w cdot x +b = 1)$
          $$1 = w cdot gamma cdot fracw$$
          $$gamma= frac1w cdot fracw $$
          we also know that $w cdot w = |w|^2$, hence:
          $$gamma= frac1$$
          Why is in your formula 2 instead of 1? because I have calculated the margin between the middle line and the upper, not the whole margin.




          • How can $y_i(w^Tx+b)ge1;;forall;x_i$?



          We want to classify the points in the +1 part as +1 and the points in the -1 part as -1, since $(w^Tx_i+b)$ is the predicted value and $y_i$ is the actual value for each point, if it is classified correctly, then the predicted and actual values should be same so their production $y_i(w^Tx+b)$ should be positive (the term >= 0 is substituded by >= 1 because it is a stronger condition)



          The transpose is in order to be able to calculate the dot product. I just wanted to show the logic of dot product hence, didn't write transpose




          For calculating the total distance between lines $w cdot x + b = -1$ and $w cdot x + b = 1$:



          Either you can multiply the calculated margin by 2 Or if you want to directly find it, you can consider a point $alpha$ in line $w cdot x + b = -1$. then we know that the distance between these two lines is twice the value of $gamma$, hence if we want to move from the point z to $alpha$, the total margin (passed length) would be:
          $$z - 2 cdot gamma cdot fracw$$ then we can calculate the margin from here.



          derived from ML course of UCSD by Prof. Sanjoy Dasgupta






          share|improve this answer











          $endgroup$








          • 2




            $begingroup$
            "we know that we have moved from z, in direction of w to point λ" Here z lies in ($w^t.x+b=+1$.) we also know that normal(w) to the plane $pi$ is in direction of positive points right? So aren't we moving from z to point $lambda$ in opposite direction of w?
            $endgroup$
            – user_6396
            Jul 12 at 23:50











          • $begingroup$
            How can I calculate whole margin so that formula becomes $frac2$? It will give me an intuition.
            $endgroup$
            – user_6396
            Jul 13 at 0:15







          • 1




            $begingroup$
            I added it to the bottom of my post.
            $endgroup$
            – Fatemeh Asgarinejad
            Jul 13 at 0:58






          • 1




            $begingroup$
            I really appreciate all the help. Thanks :)
            $endgroup$
            – user_6396
            Jul 13 at 1:03






          • 1




            $begingroup$
            I'm happy that I could help. :)
            $endgroup$
            – Fatemeh Asgarinejad
            Jul 13 at 1:03















          6












          $begingroup$

          Your understandings are right.




          deriving the margin to be $frac2$




          we know that $w cdot x +b = 1$



          If we move from point z in $w cdot x +b = 1$ to the $w cdot x +b = 0$ we land in a point $lambda$. This line that we have passed or this margin between the two lines $w cdot x +b = 1$ and $w cdot x +b = 0$ is the margin between them which we call $gamma$



          For calculating the margin, we know that we have moved from z, in opposite direction of w to point $lambda$. Hence this margin $gamma$ would be equal to $z - margin cdot fracw = z - gamma cdot fracw =$ (we have moved in the opposite direction of w, we just want the direction so we normalize w to be a unit vector $fracw$)



          Since this $lambda$ point lies in the decision boundary we know that it should suit in line $w cdot x + b = 0$
          Hence we set is in this line in place of x:



          $$w cdot x + b = 0$$
          $$w cdot (z - gamma cdot fracw) + b = 0$$
          $$w cdot z + b - w cdot gamma cdot fracw) = 0$$
          $$w cdot z + b = w cdot gamma cdot fracw$$
          we know that $w cdot z +b = 1$ (z is the point on $w cdot x +b = 1)$
          $$1 = w cdot gamma cdot fracw$$
          $$gamma= frac1w cdot fracw $$
          we also know that $w cdot w = |w|^2$, hence:
          $$gamma= frac1$$
          Why is in your formula 2 instead of 1? because I have calculated the margin between the middle line and the upper, not the whole margin.




          • How can $y_i(w^Tx+b)ge1;;forall;x_i$?



          We want to classify the points in the +1 part as +1 and the points in the -1 part as -1, since $(w^Tx_i+b)$ is the predicted value and $y_i$ is the actual value for each point, if it is classified correctly, then the predicted and actual values should be same so their production $y_i(w^Tx+b)$ should be positive (the term >= 0 is substituded by >= 1 because it is a stronger condition)



          The transpose is in order to be able to calculate the dot product. I just wanted to show the logic of dot product hence, didn't write transpose




          For calculating the total distance between lines $w cdot x + b = -1$ and $w cdot x + b = 1$:



          Either you can multiply the calculated margin by 2 Or if you want to directly find it, you can consider a point $alpha$ in line $w cdot x + b = -1$. then we know that the distance between these two lines is twice the value of $gamma$, hence if we want to move from the point z to $alpha$, the total margin (passed length) would be:
          $$z - 2 cdot gamma cdot fracw$$ then we can calculate the margin from here.



          derived from ML course of UCSD by Prof. Sanjoy Dasgupta






          share|improve this answer











          $endgroup$








          • 2




            $begingroup$
            "we know that we have moved from z, in direction of w to point λ" Here z lies in ($w^t.x+b=+1$.) we also know that normal(w) to the plane $pi$ is in direction of positive points right? So aren't we moving from z to point $lambda$ in opposite direction of w?
            $endgroup$
            – user_6396
            Jul 12 at 23:50











          • $begingroup$
            How can I calculate whole margin so that formula becomes $frac2$? It will give me an intuition.
            $endgroup$
            – user_6396
            Jul 13 at 0:15







          • 1




            $begingroup$
            I added it to the bottom of my post.
            $endgroup$
            – Fatemeh Asgarinejad
            Jul 13 at 0:58






          • 1




            $begingroup$
            I really appreciate all the help. Thanks :)
            $endgroup$
            – user_6396
            Jul 13 at 1:03






          • 1




            $begingroup$
            I'm happy that I could help. :)
            $endgroup$
            – Fatemeh Asgarinejad
            Jul 13 at 1:03













          6












          6








          6





          $begingroup$

          Your understandings are right.




          deriving the margin to be $frac2$




          we know that $w cdot x +b = 1$



          If we move from point z in $w cdot x +b = 1$ to the $w cdot x +b = 0$ we land in a point $lambda$. This line that we have passed or this margin between the two lines $w cdot x +b = 1$ and $w cdot x +b = 0$ is the margin between them which we call $gamma$



          For calculating the margin, we know that we have moved from z, in opposite direction of w to point $lambda$. Hence this margin $gamma$ would be equal to $z - margin cdot fracw = z - gamma cdot fracw =$ (we have moved in the opposite direction of w, we just want the direction so we normalize w to be a unit vector $fracw$)



          Since this $lambda$ point lies in the decision boundary we know that it should suit in line $w cdot x + b = 0$
          Hence we set is in this line in place of x:



          $$w cdot x + b = 0$$
          $$w cdot (z - gamma cdot fracw) + b = 0$$
          $$w cdot z + b - w cdot gamma cdot fracw) = 0$$
          $$w cdot z + b = w cdot gamma cdot fracw$$
          we know that $w cdot z +b = 1$ (z is the point on $w cdot x +b = 1)$
          $$1 = w cdot gamma cdot fracw$$
          $$gamma= frac1w cdot fracw $$
          we also know that $w cdot w = |w|^2$, hence:
          $$gamma= frac1$$
          Why is in your formula 2 instead of 1? because I have calculated the margin between the middle line and the upper, not the whole margin.




          • How can $y_i(w^Tx+b)ge1;;forall;x_i$?



          We want to classify the points in the +1 part as +1 and the points in the -1 part as -1, since $(w^Tx_i+b)$ is the predicted value and $y_i$ is the actual value for each point, if it is classified correctly, then the predicted and actual values should be same so their production $y_i(w^Tx+b)$ should be positive (the term >= 0 is substituded by >= 1 because it is a stronger condition)



          The transpose is in order to be able to calculate the dot product. I just wanted to show the logic of dot product hence, didn't write transpose




          For calculating the total distance between lines $w cdot x + b = -1$ and $w cdot x + b = 1$:



          Either you can multiply the calculated margin by 2 Or if you want to directly find it, you can consider a point $alpha$ in line $w cdot x + b = -1$. then we know that the distance between these two lines is twice the value of $gamma$, hence if we want to move from the point z to $alpha$, the total margin (passed length) would be:
          $$z - 2 cdot gamma cdot fracw$$ then we can calculate the margin from here.



          derived from ML course of UCSD by Prof. Sanjoy Dasgupta






          share|improve this answer











          $endgroup$



          Your understandings are right.




          deriving the margin to be $frac2$




          we know that $w cdot x +b = 1$



          If we move from point z in $w cdot x +b = 1$ to the $w cdot x +b = 0$ we land in a point $lambda$. This line that we have passed or this margin between the two lines $w cdot x +b = 1$ and $w cdot x +b = 0$ is the margin between them which we call $gamma$



          For calculating the margin, we know that we have moved from z, in opposite direction of w to point $lambda$. Hence this margin $gamma$ would be equal to $z - margin cdot fracw = z - gamma cdot fracw =$ (we have moved in the opposite direction of w, we just want the direction so we normalize w to be a unit vector $fracw$)



          Since this $lambda$ point lies in the decision boundary we know that it should suit in line $w cdot x + b = 0$
          Hence we set is in this line in place of x:



          $$w cdot x + b = 0$$
          $$w cdot (z - gamma cdot fracw) + b = 0$$
          $$w cdot z + b - w cdot gamma cdot fracw) = 0$$
          $$w cdot z + b = w cdot gamma cdot fracw$$
          we know that $w cdot z +b = 1$ (z is the point on $w cdot x +b = 1)$
          $$1 = w cdot gamma cdot fracw$$
          $$gamma= frac1w cdot fracw $$
          we also know that $w cdot w = |w|^2$, hence:
          $$gamma= frac1$$
          Why is in your formula 2 instead of 1? because I have calculated the margin between the middle line and the upper, not the whole margin.




          • How can $y_i(w^Tx+b)ge1;;forall;x_i$?



          We want to classify the points in the +1 part as +1 and the points in the -1 part as -1, since $(w^Tx_i+b)$ is the predicted value and $y_i$ is the actual value for each point, if it is classified correctly, then the predicted and actual values should be same so their production $y_i(w^Tx+b)$ should be positive (the term >= 0 is substituded by >= 1 because it is a stronger condition)



          The transpose is in order to be able to calculate the dot product. I just wanted to show the logic of dot product hence, didn't write transpose




          For calculating the total distance between lines $w cdot x + b = -1$ and $w cdot x + b = 1$:



          Either you can multiply the calculated margin by 2 Or if you want to directly find it, you can consider a point $alpha$ in line $w cdot x + b = -1$. then we know that the distance between these two lines is twice the value of $gamma$, hence if we want to move from the point z to $alpha$, the total margin (passed length) would be:
          $$z - 2 cdot gamma cdot fracw$$ then we can calculate the margin from here.



          derived from ML course of UCSD by Prof. Sanjoy Dasgupta







          share|improve this answer














          share|improve this answer



          share|improve this answer








edited Jul 13 at 20:23

answered Jul 12 at 23:28 by Fatemeh Asgarinejad







          • 2




            $begingroup$
            "we know that we have moved from z, in direction of w to point λ" Here z lies in ($w^t.x+b=+1$.) we also know that normal(w) to the plane $pi$ is in direction of positive points right? So aren't we moving from z to point $lambda$ in opposite direction of w?
            $endgroup$
            – user_6396
            Jul 12 at 23:50











          • $begingroup$
            How can I calculate whole margin so that formula becomes $frac2$? It will give me an intuition.
            $endgroup$
            – user_6396
            Jul 13 at 0:15







          • 1




            $begingroup$
            I added it to the bottom of my post.
            $endgroup$
            – Fatemeh Asgarinejad
            Jul 13 at 0:58






          • 1




            $begingroup$
            I really appreciate all the help. Thanks :)
            $endgroup$
            – user_6396
            Jul 13 at 1:03






          • 1




            $begingroup$
            I'm happy that I could help. :)
            $endgroup$
            – Fatemeh Asgarinejad
            Jul 13 at 1:03






























































































