I have this data set for crimes of a 12 month time period, over 250k rows. I want to predict future crimes by date and location
I have this 250k-row data set with the following features:
   date_time      FullAddress                call_type  priority  lat        long
0  6/14/17 21:54  10 14TH ST, San Diego, CA  1151       2.0       32.705449  -117.151870
1  3/29/17 22:24  10 14TH ST, San Diego, CA  1016       2.0       32.705449  -117.151870
2  6/3/17 18:04   10 14TH ST, San Diego, CA  1016       2.0       32.705449  -117.151870
3  3/17/17 10:57  10 14TH ST, San Diego, CA  1151       2.0       32.705449  -117.151870
4  3/3/17 23:45   10 15TH ST, San Diego, CA  911P       2.0       32.705722  -117.15035
The columns are date and time, full address, call type, priority (the level of seriousness of the crime), and latitude and longitude.
I want to predict when future crimes will happen, or where they will happen again. How can I do that, and should I use regression or classification? I have already predicted the priority, but how can I predict the time or the location of a crime?
Predicting the priority doesn't really give me anything useful. I want to predict time and location, or at least one of the two.
This is the code I have for my priority prediction:
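from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# X (features) and y (priority labels) are built earlier and are not shown here
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit a random forest and predict the priority on the held-out split
my_RandomForest = RandomForestClassifier(n_estimators=100, random_state=0)
my_RandomForest.fit(X_train, y_train)
y_predict_fr = my_RandomForest.predict(X_test)
print(y_predict_fr)

# Fraction of held-out rows whose priority was predicted correctly
accuracy_fr = accuracy_score(y_test, y_predict_fr)
print(accuracy_fr)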
[4. 3. 2. ... 3. 1. 2.]
0.95100761598545
machine-learning scikit-learn predictive-modeling pandas
edited Apr 3 at 9:06 by Harman
asked Apr 3 at 2:28 by David Arriaga
1 Answer
So, remember the uses of regression and classification. Basically, if you want to predict a discrete variable you should use classification; if you want to predict a continuous variable you should use regression. You can also use regression for a discrete variable if you quantize the output afterwards, using some thresholding method or something like that.
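As a minimal sketch of the quantization idea, assume the discrete target is the `priority` column and that its valid levels are 1.0 to 4.0 (a guess based on the sample output in the question, not something the data set confirms). A continuous regression output can then be snapped to the nearest valid level. The function name `quantize_to_levels` and the sample predictions are hypothetical:

```python
from typing import Sequence


def quantize_to_levels(value: float, levels: Sequence[float]) -> float:
    """Return the allowed level closest to a continuous prediction.

    Ties go to the level listed first in `levels`.
    """
    if not levels:
        raise ValueError("levels must not be empty")
    return min(levels, key=lambda level: abs(level - value))


# Hypothetical priority levels and raw regression outputs, for illustration only.
PRIORITY_LEVELS = [1.0, 2.0, 3.0, 4.0]
raw_predictions = [1.2, 2.6, 3.49, 4.8, 0.3]

quantized = [quantize_to_levels(p, PRIORITY_LEVELS) for p in raw_predictions]
print(quantized)
```

Nearest-level snapping is only one possible thresholding rule; hand-picked cut points would work the same way with a different comparison.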
Time and location are continuous variables, so you will predict them with regression.
call_type is a discrete variable, so you will predict it with classification.
Full address is dependent on lat and long, so you should probably remove it from your model.
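One way to act on these points, sketched under assumptions: parse `date_time` into numbers a regressor can consume, keep `lat` and `long` as continuous targets, and leave `FullAddress` out. The `%m/%d/%y %H:%M` format is inferred from the sample rows in the question; the helper name `to_model_row` and the choice of time features (fractional hour, weekday, day of year) are illustrative, not something the answer prescribes.

```python
from datetime import datetime

# Format inferred from sample rows such as "6/14/17 21:54" (an assumption).
DATE_FORMAT = "%m/%d/%y %H:%M"


def to_model_row(record: dict) -> dict:
    """Turn one raw record into model fields; FullAddress is dropped on purpose."""
    ts = datetime.strptime(record["date_time"], DATE_FORMAT)
    return {
        "hour": ts.hour + ts.minute / 60.0,     # continuous time of day
        "weekday": ts.weekday(),                # 0 = Monday ... 6 = Sunday
        "day_of_year": ts.timetuple().tm_yday,  # 1 ... 366
        "call_type": record["call_type"],       # discrete: classification target
        "priority": record["priority"],         # kept unchanged
        "lat": float(record["lat"]),            # continuous: regression target
        "long": float(record["long"]),          # continuous: regression target
    }


# First sample row from the question.
sample = {
    "date_time": "6/14/17 21:54",
    "FullAddress": "10 14TH ST, San Diego, CA",
    "call_type": "1151",
    "priority": 2.0,
    "lat": 32.705449,
    "long": -117.151870,
}

row = to_model_row(sample)
print(row)
```

Rows in this shape can be split into features and targets as needed, for example time fields as features with `lat` and `long` as two regression targets.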
answered Apr 3 at 13:32 by Pedro Henrique Monforte
It might make sense for OP to bin location into regions (streets, districts, etc.), in which case the transformed feature would fall under classification.
– ukemi, Apr 3 at 14:08
The problem with that is that it will generate too many classes, and lat and long can be mapped back into regions anyway.
– Pedro Henrique Monforte, Apr 3 at 14:13
But yes, if you want to go that way, it is totally valid.
– Pedro Henrique Monforte, Apr 3 at 14:13
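The binning idea from the comments can be sketched as follows, assuming a simple square grid over latitude and longitude. The cell size of 0.01 degrees and the names `grid_cell` and `CELL_DEG` are arbitrary illustrative choices; real regions such as streets or districts would need a lookup against boundary data, which is not shown.

```python
import math
from collections import Counter

CELL_DEG = 0.01  # hypothetical cell size in degrees


def grid_cell(lat: float, long: float, cell_deg: float = CELL_DEG) -> tuple:
    """Map a coordinate to the integer (row, col) index of its grid cell."""
    return (math.floor(lat / cell_deg), math.floor(long / cell_deg))


# Coordinates taken from the sample rows in the question.
points = [
    (32.705449, -117.151870),
    (32.705449, -117.151870),
    (32.705722, -117.15035),
]

labels = [grid_cell(lat, long) for lat, long in points]
counts = Counter(labels)  # incidents per cell; each distinct cell is one class
print(labels)
print(counts)
```

With a finer grid such as 0.001 degrees, the 14TH ST and 15TH ST sample points fall into different cells. Smaller cells mean more classes, which is the trade-off raised in the reply above.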