Model Performance using Precision as evaluation metric
I am dealing with an imbalanced class distribution (total dataset size: 10763 × 20):

0: 91%
1: 9%
To build a model on this imbalanced dataset, I have compared two approaches:

1) SMOTE, and
2) assigning more weight to the minority class when calling fit,

and the latter seems to work better.
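For reference, the minority-class weighting in approach 2) is usually derived from the class ratio. This is a minimal sketch, with counts back-computed from the 91%/9% split stated above; the negative-to-positive ratio is a common heuristic, not necessarily the exact calculation used here:

```python
# Approximate class counts implied by the 91% / 9% split of 10763 rows.
n_total = 10763
n_pos = round(0.09 * n_total)   # minority class (label 1)
n_neg = n_total - n_pos         # majority class (label 0)

# A common heuristic for XGBoost's scale_pos_weight is the ratio of
# negative to positive examples; it lands close to the value 10
# used in the classifier configuration below.
scale_pos_weight = n_neg / n_pos
print(round(scale_pos_weight, 2))  # prints 10.11
```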
After experimenting with a decision tree, logistic regression, random forest, and SVM (poly and RBF kernels), I am now using an XGBoost classifier, which gives me the classification results below (the best numbers I've got so far).

The business problem I'm trying to solve requires the model to have high precision, because the cost of a false positive is high.
Here's my XGBClassifier code:

xgb3 = XGBClassifier(
    learning_rate=0.01,
    n_estimators=2000,
    max_depth=15,
    min_child_weight=6,
    gamma=0.4,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.005,
    objective='binary:logistic',
    nthread=4,
    scale_pos_weight=10,
    eval_metric='logloss',  # was misspelled "eval_metrics", which is silently ignored
    seed=27)
model_fit(xgb3, X_train, y_train, X_test, y_test)
And here's the code for model_fit:

import xgboost as xgb
from sklearn.metrics import confusion_matrix, classification_report

def model_fit(algorithm, X_train, y_train, X_test, y_test,
              cv_folds=5, useTrainCV=True, early_stopping_rounds=50):
    # Note: useTrainCV must be a bool; the string "True" is always truthy.
    if useTrainCV:
        # Use xgb.cv with early stopping to pick the number of boosting rounds.
        xgb_param = algorithm.get_xgb_params()
        xgbtrain = xgb.DMatrix(X_train, label=y_train)
        cvresult = xgb.cv(xgb_param, xgbtrain,
                          num_boost_round=algorithm.get_params()['n_estimators'],
                          nfold=cv_folds, metrics='logloss',
                          early_stopping_rounds=early_stopping_rounds)
        algorithm.set_params(n_estimators=cvresult.shape[0])
    algorithm.fit(X_train, y_train, eval_metric='logloss')
    y_pred = algorithm.predict(X_test)
    cm = confusion_matrix(y_test, y_pred)
    print(cm)
    print(classification_report(y_test, y_pred))
Can anyone tell me how I can increase the precision of the model? I've tried everything I know. I'd really appreciate any help.
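Not part of the original post, but one commonly suggested lever for raising precision: `predict` uses a fixed 0.5 cut-off, so tuning the decision threshold on `predict_proba` output can trade recall for precision. A minimal pure-Python sketch with hypothetical scores standing in for `algorithm.predict_proba(X_test)[:, 1]` and `y_test`:

```python
# Hypothetical predicted probabilities and true labels (illustration only).
probs  = [0.95, 0.80, 0.65, 0.55, 0.45, 0.30, 0.20, 0.10]
labels = [1,    1,    0,    1,    0,    0,    1,    0]

def precision_at(threshold):
    """Precision of the positive class when predicting 1 above `threshold`."""
    preds = [int(p >= threshold) for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    return tp / (tp + fp) if (tp + fp) else 0.0

# Raising the threshold trades recall for precision.
print(precision_at(0.5))   # default cut-off -> 0.75
print(precision_at(0.75))  # stricter cut-off -> 1.0
```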
machine-learning classification xgboost performance
Comments:

– honar.cs (Mar 22 at 17:27): What is the type of your data? Is it images?

– Shekhar Tanwar (Mar 22 at 17:31): No, it is mostly numerical healthcare data, with attributes like age, gender, smoking status (yes/no), and the tests a person has taken. All the preprocessing steps have been performed.
asked Mar 22 at 17:12 by Shekhar Tanwar