Model Performance using Precision as evaluation metric
I am dealing with an imbalanced binary classification problem with the following class distribution (total dataset size: 10763 × 20):

0 : 91%

1 : 9%

To build a model on this imbalanced dataset, I have compared two approaches:

1) oversampling with SMOTE, and

2) assigning more weight to the minority class when calling fit,

and the latter seems to work better.
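(As a side note on option 2: below is a minimal sketch of how a starting weight for the minority class can be derived from the imbalance ratio. The label vector `y` here is synthetic, built to match the 91%/9% split above, and the final `scale_pos_weight` value is a tuning choice, not a rule.)

```python
import numpy as np

# Synthetic label vector with roughly the same 91% / 9% split as the dataset.
rng = np.random.default_rng(0)
y = (rng.random(10763) < 0.09).astype(int)

# A common starting point for XGBoost's scale_pos_weight is the
# ratio of negative to positive examples.
n_neg = int((y == 0).sum())
n_pos = int((y == 1).sum())
scale_pos_weight = n_neg / n_pos
print(round(scale_pos_weight, 1))  # roughly 10 for a 91/9 split
```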



After experimenting with a decision tree, logistic regression, random forest, and SVM (polynomial and RBF kernels), I am now using an XGBoost classifier, which gives me the classification results below (these are the best numbers I've got so far):



[screenshot of the confusion matrix and classification report omitted]



The business problem I'm trying to solve requires the model to have high precision, because the cost of a false positive is high.
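(For context on the precision requirement: a toy illustration of how raising the decision threshold trades recall for precision. The scores and labels here are made up, and 0.5 is just the default cut-off that predict uses.)

```python
import numpy as np

# Made-up predicted probabilities and true labels, for illustration only.
y_true  = np.array([0, 0, 0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.9, 0.6, 0.65, 0.2, 0.3])

def precision_at(threshold):
    # Precision = TP / (TP + FP) among examples predicted positive.
    y_pred = (y_score >= threshold).astype(int)
    tp = int(((y_pred == 1) & (y_true == 1)).sum())
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    return tp / (tp + fp) if (tp + fp) else 0.0

print(precision_at(0.5))   # 0.6  (default threshold)
print(precision_at(0.65))  # 0.75 (stricter threshold -> fewer false positives)
```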



Here's my XGBClassifier's code:



from xgboost import XGBClassifier

xgb3 = XGBClassifier(
    learning_rate=0.01,
    n_estimators=2000,
    max_depth=15,
    min_child_weight=6,
    gamma=0.4,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.005,
    objective='binary:logistic',
    nthread=4,
    scale_pos_weight=10,
    eval_metric='logloss',
    seed=27)
model_fit(xgb3, X_train, y_train, X_test, y_test)


And here's the code for model_fit:



import xgboost as xgb
from sklearn.metrics import confusion_matrix, classification_report

def model_fit(algorithm, X_train, y_train, X_test, y_test,
              cv_folds=5, useTrainCV=True, early_stopping_rounds=50):
    # useTrainCV must be a real boolean; the string "True" is always truthy.
    if useTrainCV:
        # Use xgb.cv with early stopping to pick the number of boosting rounds.
        xgb_param = algorithm.get_xgb_params()
        xgbtrain = xgb.DMatrix(X_train, label=y_train)
        cvresult = xgb.cv(xgb_param, xgbtrain,
                          num_boost_round=algorithm.get_params()['n_estimators'],
                          nfold=cv_folds, metrics='logloss',
                          early_stopping_rounds=early_stopping_rounds)
        algorithm.set_params(n_estimators=cvresult.shape[0])

    algorithm.fit(X_train, y_train, eval_metric='logloss')
    y_pred = algorithm.predict(X_test)
    cm = confusion_matrix(y_test, y_pred)
    print(cm)
    print(classification_report(y_test, y_pred))


Can anyone tell me how I can increase the precision of the model? I've tried everything I know of. I'd really appreciate any help here.
  • What is the type of your data? Is it images?
    – honar.cs
    Mar 22 at 17:27

  • No, it is mostly numerical healthcare data, with attributes like age, gender, smoking status (yes/no), and the tests a person has taken. All the preprocessing steps have been performed.
    – Shekhar Tanwar
    Mar 22 at 17:31
Tags: machine-learning, classification, xgboost, performance
asked Mar 22 at 17:12









Shekhar Tanwar