


Model Performance using Precision as evaluation metric



























I am dealing with an imbalanced binary classification problem with the following class distribution (total dataset size: 10763 × 20):

0 : 91%

1 : 9%



To build a model on this imbalanced dataset, I have compared results using

1) SMOTE, and

2) assigning more weight to the minority class when calling fit,

and the latter seems to be working better.
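(A rough, illustrative sketch of what the two setups look like is below; it assumes imbalanced-learn's SMOTE and uses placeholder parameters rather than my exact code.)

# Illustrative only: the two approaches being compared.
# Assumes imbalanced-learn (imblearn) is installed; parameter values are placeholders.
from imblearn.over_sampling import SMOTE
from xgboost import XGBClassifier

# 1) Oversample the minority class before fitting
X_res, y_res = SMOTE(random_state=27).fit_resample(X_train, y_train)
clf_smote = XGBClassifier(objective='binary:logistic').fit(X_res, y_res)

# 2) Keep the data as-is and up-weight the minority class instead
clf_weighted = XGBClassifier(objective='binary:logistic',
                             scale_pos_weight=10).fit(X_train, y_train)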



After experimenting with a Decision Tree, Logistic Regression, Random Forest, and SVM (polynomial and RBF kernels), I am now using an XGBoost classifier, which gives me the classification results below (these are the best numbers I've got so far):



[screenshot of the classification results]



The business problem I'm trying to solve requires the model to have high precision, as the cost associated with false positives is high.



Here's my XGBClassifier code:



xgb3 = XGBClassifier(
    learning_rate=0.01,
    n_estimators=2000,
    max_depth=15,
    min_child_weight=6,
    gamma=0.4,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.005,
    objective='binary:logistic',
    nthread=4,
    scale_pos_weight=10,
    eval_metric='logloss',
    seed=27)
model_fit(xgb3, X_train, y_train, X_test, y_test)


And here's the code for model_fit:



import xgboost as xgb
from sklearn.metrics import confusion_matrix, classification_report

def model_fit(algorithm, X_train, y_train, X_test, y_test,
              cv_folds=5, useTrainCV=True, early_stopping_rounds=50):
    if useTrainCV:
        # Use xgb.cv with early stopping to choose the number of boosting rounds
        xgb_param = algorithm.get_xgb_params()
        xgbtrain = xgb.DMatrix(X_train, label=y_train)
        cvresult = xgb.cv(xgb_param, xgbtrain,
                          num_boost_round=algorithm.get_params()['n_estimators'],
                          nfold=cv_folds, metrics='logloss',
                          early_stopping_rounds=early_stopping_rounds)
        algorithm.set_params(n_estimators=cvresult.shape[0])

    # Refit on the full training set and report test-set results
    algorithm.fit(X_train, y_train, eval_metric='logloss')
    y_pred = algorithm.predict(X_test)
    cm = confusion_matrix(y_test, y_pred)
    print(cm)
    print(classification_report(y_test, y_pred))


Can anyone tell me how I can increase the precision of the model? I've tried everything I know of, so I'd really appreciate any help here.
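(Side note on how precision is being read here: it is threshold-dependent, so the numbers above correspond to the default 0.5 cutoff used by predict. A minimal, illustrative sketch of inspecting precision and recall across cutoffs with scikit-learn's precision_recall_curve, using the fitted xgb3 from above, would look roughly like this:)

# Illustrative only: precision/recall at different probability cutoffs on the test set.
from sklearn.metrics import precision_recall_curve

proba = xgb3.predict_proba(X_test)[:, 1]   # predicted probability of class 1
precision, recall, thresholds = precision_recall_curve(y_test, proba)
for p, r, t in zip(precision, recall, thresholds):
    print(f"threshold={t:.2f}  precision={p:.3f}  recall={r:.3f}")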










machine-learning classification xgboost performance

asked Mar 22 at 17:12 by Shekhar Tanwar











  • What is the type of your data? Is it images? – honar.cs, Mar 22 at 17:27

  • No, it is mostly numerical healthcare data, with attributes like age, gender, smoking (yes/no), and the tests a person has taken. All the preprocessing steps have been performed. – Shekhar Tanwar, Mar 22 at 17:31














