When to question output of model



$begingroup$


I'm unsure how to ask this without making it seem like a code review question. At what point should one question whether an algorithm and/or model has actually been implemented correctly? Getting spot-on results is great, but it seems highly suspect. Also, what checks can be done to ensure that the algorithm and/or model is implemented correctly? I'm asking because I'm getting perfect classification, and consequently perfect accuracy, precision, etc., with my implementation of SVM.



I am including the code, but feel free to ignore it.



import numpy as np
from sklearn import svm
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split

# iris_df is assumed to be an already-loaded DataFrame with the four measurement columns and a 'Class' column.

# Make a copy of the df
iris_df_copy = iris_df.copy()

# Create a new column, labeled 'T/F', whose value will be based on the value in the 'Class' column. If the value in the
# 'Class' column is 'Iris-setosa', then set the value of the 'T/F' column to 1. If the value in the 'Class' column is
# not 'Iris-setosa', then set the value of the 'T/F' column to 0.
iris_df_copy.loc[iris_df_copy.Class == 'Iris-setosa', 'T/F'] = 1
iris_df_copy.loc[iris_df_copy.Class != 'Iris-setosa', 'T/F'] = 0

X_svm = np.array(iris_df_copy[['Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width']])
y_svm = np.ravel(iris_df_copy[['T/F']])

# Split the samples into two subsets, use one for training and the other for testing
X_train_svm, X_test_svm, y_train_svm, y_test_svm = train_test_split(X_svm, y_svm, test_size=0.25, random_state=4)

# Instantiate the learning model - Linear SVM
linear_svm = svm.SVC(kernel='linear')

# Fit the model - Linear SVM
linear_svm.fit(X_train_svm, y_train_svm)

# Predict the response - Linear SVM
linear_svm_pred = linear_svm.predict(X_test_svm)

# Confusion matrix and quantitative metrics - Linear SVM
# (the np.str alias was removed in NumPy 1.24, so the built-in str is used instead)
print("The confusion matrix is: " + str(confusion_matrix(y_test_svm, linear_svm_pred)))
print("The accuracy score is: " + str(accuracy_score(y_test_svm, linear_svm_pred)))
print("The precision is: " + str(precision_score(y_test_svm, linear_svm_pred, average="macro")))
print("The recall is: " + str(recall_score(y_test_svm, linear_svm_pred, average="macro")))

















$endgroup$
















Tags: machine-learning, scikit-learn, svm






asked Mar 22 at 22:39 by user3727648





















1 Answer
          $begingroup$

You need to know what the outcome of a given test on a dataset should be before you try a new method on it. Ask yourself, 'What do I expect from this?'
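
One way to make that expectation concrete is to compare the model against a trivial baseline and against the same model fitted on shuffled labels. The snippet below is only a sketch: it assumes scikit-learn's bundled copy of Iris (`load_iris`) in place of your `iris_df`, and the 5-fold cross-validation and 30 permutations are arbitrary choices of mine.

```python
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score, permutation_test_score

# Assumption: scikit-learn's bundled Iris data stands in for iris_df.
iris = load_iris()
X = iris.data
y = (iris.target == 0).astype(int)  # 1 = setosa, 0 = the other two species

# Baseline: always predict the most frequent class.
baseline_scores = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y, cv=5)

# Real model versus the same model fitted on randomly permuted labels.
score, permutation_scores, p_value = permutation_test_score(
    svm.SVC(kernel="linear"), X, y, cv=5, n_permutations=30, random_state=0
)

print("Baseline accuracy:        ", baseline_scores.mean())
print("Linear SVM accuracy:      ", score)
print("Mean accuracy on shuffles:", permutation_scores.mean())
print("Permutation p-value:      ", p_value)
```

If the real score sits far above both the baseline and the shuffled-label scores, the model is picking up a real relationship between features and labels. This does not rule out every bug (a label leaking into the features would pass it, for example), so treat it as one check among several.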



A linear SVM finds a plane (a hyperplane, in more than three dimensions) that cuts through the data to best separate the two classes.
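
To see that plane directly, you can read it off the fitted model. This is a rough sketch, again assuming scikit-learn's bundled Iris data rather than your `iris_df`; `coef_` and `intercept_` are the attributes scikit-learn exposes for a linear-kernel `SVC`.

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris data stands in for iris_df.
iris = load_iris()
X = iris.data
y = (iris.target == 0).astype(int)  # 1 = setosa, 0 = the other two species

linear_svm = svm.SVC(kernel="linear")
linear_svm.fit(X, y)

w = linear_svm.coef_[0]       # normal vector of the separating hyperplane
b = linear_svm.intercept_[0]  # offset of the hyperplane

# The signed score w.x + b says which side of the plane a sample falls on.
manual_scores = X @ w + b
manual_pred = (manual_scores > 0).astype(int)

print("Hyperplane normal:", w)
print("Hyperplane offset:", b)
print("Matches decision_function:", np.allclose(manual_scores, linear_svm.decision_function(X)))
print("Matches predict:", np.array_equal(manual_pred, linear_svm.predict(X)))
```

Reproducing the predictions by hand like this is a cheap way to confirm that the model is doing what you think it is doing.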



If you look at what you are separating (Iris setosa from Iris virginica and Iris versicolor), you'll find that the clusters themselves are perfectly separated. You can easily draw a line between them on each graph you care to use, and that is what I have done in the picture below. If the clusters are linearly separable like this, a linear SVM will return a perfectly separated result.
[Image: plots of the Iris data with the separating lines drawn in]
By Nicoguaro - Own work, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=46257808
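
You can also check the separation numerically instead of by eye. The sketch below assumes scikit-learn's bundled Iris data (whose feature names differ from your column names) and simply asks, feature by feature, whether the setosa range and the range of the other two species overlap.

```python
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris data stands in for iris_df.
iris = load_iris()
X = iris.data
is_setosa = iris.target == 0

separable = {}
for name, column in zip(iris.feature_names, X.T):
    setosa_min, setosa_max = column[is_setosa].min(), column[is_setosa].max()
    other_min, other_max = column[~is_setosa].min(), column[~is_setosa].max()
    # A single threshold on this feature works if the two ranges do not overlap.
    separable[name] = bool(setosa_max < other_min or other_max < setosa_min)
    print(f"{name}: setosa [{setosa_min}, {setosa_max}], "
          f"others [{other_min}, {other_max}], gap: {separable[name]}")
```

Any feature with a gap is enough on its own to split setosa from the rest with one threshold, so a perfect score from a linear classifier is exactly what you should expect here.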



Test the SVM on separating virginica from versicolor to see how it does in a more difficult setting. Alternatively, generate a dataset of your own from randomly placed Gaussian points.
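
Both suggestions could look something like this. It is a sketch under my own assumptions: scikit-learn's bundled Iris data, 5-fold cross-validation, and blob centres and spread that I picked arbitrarily so that the two clusters overlap.

```python
from sklearn import svm
from sklearn.datasets import load_iris, make_blobs
from sklearn.model_selection import cross_val_score

# Harder Iris task: versicolor (target 1) versus virginica (target 2).
iris = load_iris()
mask = iris.target != 0
X_hard, y_hard = iris.data[mask], iris.target[mask]
hard_scores = cross_val_score(svm.SVC(kernel="linear"), X_hard, y_hard, cv=5)

# Synthetic task: two overlapping Gaussian clusters (centres and spread are arbitrary).
X_blobs, y_blobs = make_blobs(
    n_samples=400, centers=[[0.0, 0.0], [1.5, 1.5]], cluster_std=1.0, random_state=0
)
blob_scores = cross_val_score(svm.SVC(kernel="linear"), X_blobs, y_blobs, cv=5)

print("versicolor vs virginica, mean CV accuracy:", hard_scores.mean())
print("overlapping Gaussian blobs, mean CV accuracy:", blob_scores.mean())
```

If the same pipeline stops being perfect once the classes overlap, that is good evidence the perfect setosa result came from the data and not from a mistake in the code.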















          $endgroup$






















answered Mar 23 at 0:15 by Ingolifs
