When to question output of model


I'm unsure how to ask this without making it sound like a code-review question. At what point should one question whether an algorithm and/or model has actually been implemented correctly? Getting spot-on results is great and all, but it seems highly suspect. Also, what checks can be done to confirm that the algorithm and/or model is implemented correctly? I'm asking because I'm getting perfect classification, and consequently perfect accuracy, precision, etc., with my implementation of an SVM.

I am including the code, but feel free to ignore.

import numpy as np
from sklearn import svm
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Assumes iris_df is already loaded as a pandas DataFrame with the columns
# 'Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width' and 'Class'.

# Make a copy of the df
iris_df_copy = iris_df.copy()

# Create a new column, labeled 'T/F', whose value is based on the 'Class' column:
# 1 if the class is 'Iris-setosa', 0 otherwise.
iris_df_copy.loc[iris_df_copy.Class == 'Iris-setosa', 'T/F'] = 1
iris_df_copy.loc[iris_df_copy.Class != 'Iris-setosa', 'T/F'] = 0

X_svm = np.array(iris_df_copy[['Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width']])
y_svm = np.ravel(iris_df_copy[['T/F']])

# Split the samples into two subsets, use one for training and the other for testing
X_train_svm, X_test_svm, y_train_svm, y_test_svm = train_test_split(X_svm, y_svm, test_size=0.25, random_state=4)

# Instantiate the learning model - Linear SVM
linear_svm = svm.SVC(kernel='linear')

# Fit the model - Linear SVM
linear_svm.fit(X_train_svm, y_train_svm)

# Predict the response - Linear SVM
linear_svm_pred = linear_svm.predict(X_test_svm)

# Confusion matrix and quantitative metrics - Linear SVM
# (the built-in str replaces np.str, which was removed in NumPy 1.24)
print("The confusion matrix is: " + str(confusion_matrix(y_test_svm, linear_svm_pred)))
print("The accuracy score is: " + str(accuracy_score(y_test_svm, linear_svm_pred)))
print("The precision is: " + str(precision_score(y_test_svm, linear_svm_pred, average="macro")))
print("The recall is: " + str(recall_score(y_test_svm, linear_svm_pred, average="macro")))

asked Mar 22 at 22:39

user3727648

New contributor

machine-learning scikit-learn svm

1 Answer

You need to know what outcome to expect from a given test on a dataset before you try a new method on it. Ask yourself, 'What do I expect from this?'

A linear SVM finds a plane (in general, a hyperplane) that cuts through the data so as to separate the two classes with the widest possible margin.
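For reference, the textbook hard-margin form of that problem can be written as below. This is the standard formulation rather than anything specific to the code in the question; as far as I know, scikit-learn's `SVC` solves the soft-margin variant, which adds slack variables weighted by the parameter `C`. Here $x_i$ are the samples, $y_i \in \{-1, +1\}$ their labels, and $w$, $b$ the plane's normal vector and offset.

$$
\min_{w,\,b}\ \frac{1}{2}\lVert w \rVert^2
\quad \text{subject to} \quad
y_i\,(w^\top x_i + b) \ge 1, \qquad i = 1, \dots, n
$$

A new point $x$ is then classified by $\operatorname{sign}(w^\top x + b)$. The constraints can only all be satisfied when the two classes are linearly separable, which is why separability of the data is the first thing to check.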

If you have a look at what you are separating (Iris setosa from Iris virginica and Iris versicolor), you'll find that the clumps themselves are perfectly separated. You can easily draw a line between them on each graph you care to use, and that is what I have done in the picture below. If the clumps are perfectly separated, then the SVM will return a perfectly separated result.
[Image: scatter plots of the Iris data set, with separating lines drawn in]
By Nicoguaro - Own work, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=46257808
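The separation can also be checked numerically rather than by eye. The sketch below is a minimal illustration, assuming the copy of the Iris data bundled with scikit-learn (`load_iris`) rather than the `iris_df` frame from the question. It compares the range of petal lengths for setosa with the range for the other two species.

```python
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target  # targets: 0 = setosa, 1 = versicolor, 2 = virginica

petal_length = X[:, 2]  # column 2 holds petal length in cm

setosa_max = petal_length[y == 0].max()
others_min = petal_length[y != 0].min()

print(f"largest setosa petal length:      {setosa_max}")
print(f"smallest non-setosa petal length: {others_min}")

# If the two ranges do not overlap, a single threshold on this one feature
# already separates setosa from the rest, so a linear classifier that uses
# all four features has an easy job.
print("separable on petal length alone:", setosa_max < others_min)
```

If the last line prints `True`, perfect scores for "setosa vs. the rest" are what you should expect, not a sign of a bug.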

Test the SVM on separating virginica from versicolor to see how it does on a more difficult problem. Alternatively, generate a dataset of your own from randomly placed Gaussian points.
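Both suggestions can be sketched in a few lines. This is only an illustration under stated assumptions: it uses scikit-learn's bundled `load_iris` data, and the synthetic set (two unit-variance Gaussian clumps whose centres are 2 apart, 2000 points each, seed 0) is a choice made here, not something from the question. The appeal of the synthetic case is that the best achievable accuracy is known in advance: for two equally likely unit-variance Gaussians a distance d apart it is Φ(d/2), where Φ is the standard normal CDF.

```python
import math

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

# Part 1: the harder Iris pair (versicolor vs. virginica).
iris = load_iris()
mask = iris.target != 0  # drop setosa (target 0)
X_hard, y_hard = iris.data[mask], iris.target[mask]
hard_scores = cross_val_score(SVC(kernel="linear"), X_hard, y_hard, cv=5)
print("versicolor vs. virginica, mean 5-fold accuracy:", hard_scores.mean())

# Part 2: synthetic data with a known answer.
rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
n_per_class = 2000
distance = 2.0  # distance between the two clump centres

X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n_per_class, 2))
X1 = rng.normal(loc=[distance, 0.0], scale=1.0, size=(n_per_class, 2))
X_syn = np.vstack([X0, X1])
y_syn = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])

X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.25, random_state=0, stratify=y_syn
)
syn_acc = SVC(kernel="linear").fit(X_tr, y_tr).score(X_te, y_te)

# Best possible accuracy for this setup: Phi(distance / 2),
# with Phi computed from the error function.
best_possible = 0.5 * (1.0 + math.erf((distance / 2.0) / math.sqrt(2.0)))

print("synthetic test accuracy: ", syn_acc)
print("best possible (theory):  ", best_possible)
```

A correct implementation should land close to the theoretical value on the synthetic data. A score far above it suggests leakage between training and test data, and a score far below it suggests a bug.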

answered Mar 23 at 0:15

Ingolifs

