


Model Performance using Precision as evaluation metric



























I am dealing with an imbalanced binary classification problem with the following class distribution (total dataset size: 10763 × 20):

0 : 91%

1 : 9%



To build a model on this imbalanced dataset, I have compared results using

1) SMOTE, and

2) assigning more weight to the minority class when calling fit,

and the latter seems to be working better.
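(A rough, illustrative sketch of what the two setups look like is below; it assumes imbalanced-learn's SMOTE and uses placeholder parameters rather than my exact code.)

# Illustrative only: the two approaches being compared.
# Assumes imbalanced-learn (imblearn) is installed; parameter values are placeholders.
from imblearn.over_sampling import SMOTE
from xgboost import XGBClassifier

# 1) Oversample the minority class before fitting
X_res, y_res = SMOTE(random_state=27).fit_resample(X_train, y_train)
clf_smote = XGBClassifier(objective='binary:logistic').fit(X_res, y_res)

# 2) Keep the data as-is and up-weight the minority class instead
clf_weighted = XGBClassifier(objective='binary:logistic',
                             scale_pos_weight=10).fit(X_train, y_train)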



After experimenting with a Decision Tree, Logistic Regression, Random Forest, and SVM (polynomial and RBF kernels), I am now using an XGBoost classifier, which gives me the classification results below (these are the best numbers I've got so far):



[screenshot of the classification results]



The business problem I'm trying to solve requires the model to have high precision, as the cost associated with false positives is high.



Here's my XGBClassifier code:



xgb3 = XGBClassifier(
    learning_rate=0.01,
    n_estimators=2000,
    max_depth=15,
    min_child_weight=6,
    gamma=0.4,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.005,
    objective='binary:logistic',
    nthread=4,
    scale_pos_weight=10,
    eval_metric='logloss',
    seed=27)
model_fit(xgb3, X_train, y_train, X_test, y_test)


And here's the code for model_fit:



import xgboost as xgb
from sklearn.metrics import confusion_matrix, classification_report

def model_fit(algorithm, X_train, y_train, X_test, y_test,
              cv_folds=5, useTrainCV=True, early_stopping_rounds=50):
    if useTrainCV:
        # Use xgb.cv with early stopping to choose the number of boosting rounds
        xgb_param = algorithm.get_xgb_params()
        xgbtrain = xgb.DMatrix(X_train, label=y_train)
        cvresult = xgb.cv(xgb_param, xgbtrain,
                          num_boost_round=algorithm.get_params()['n_estimators'],
                          nfold=cv_folds, metrics='logloss',
                          early_stopping_rounds=early_stopping_rounds)
        algorithm.set_params(n_estimators=cvresult.shape[0])

    # Refit on the full training set and report test-set results
    algorithm.fit(X_train, y_train, eval_metric='logloss')
    y_pred = algorithm.predict(X_test)
    cm = confusion_matrix(y_test, y_pred)
    print(cm)
    print(classification_report(y_test, y_pred))


Can anyone tell me how I can increase the precision of the model? I've tried everything I know of, so I'd really appreciate any help here.
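(Side note on how precision is being read here: it is threshold-dependent, so the numbers above correspond to the default 0.5 cutoff used by predict. A minimal, illustrative sketch of inspecting precision and recall across cutoffs with scikit-learn's precision_recall_curve, using the fitted xgb3 from above, would look roughly like this:)

# Illustrative only: precision/recall at different probability cutoffs on the test set.
from sklearn.metrics import precision_recall_curve

proba = xgb3.predict_proba(X_test)[:, 1]   # predicted probability of class 1
precision, recall, thresholds = precision_recall_curve(y_test, proba)
for p, r, t in zip(precision, recall, thresholds):
    print(f"threshold={t:.2f}  precision={p:.3f}  recall={r:.3f}")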










machine-learning classification xgboost performance

asked Mar 22 at 17:12 by Shekhar Tanwar











  • What is the type of your data? Is it images? – honar.cs, Mar 22 at 17:27

  • No, it is mostly numerical healthcare data, with attributes like age, gender, smoking (yes/no), and the tests a person has taken. All the preprocessing steps have been performed. – Shekhar Tanwar, Mar 22 at 17:31














