I have this data set for crimes of a 12-month time period, over 250k rows. I want to predict future crimes by date and location















I have a data set of over 250k rows with these features:



       date_time      FullAddress                call_type  priority  lat        long
    0  6/14/17 21:54  10 14TH ST, San Diego, CA  1151       2.0       32.705449  -117.151870
    1  3/29/17 22:24  10 14TH ST, San Diego, CA  1016       2.0       32.705449  -117.151870
    2  6/3/17 18:04   10 14TH ST, San Diego, CA  1016       2.0       32.705449  -117.151870
    3  3/17/17 10:57  10 14TH ST, San Diego, CA  1151       2.0       32.705449  -117.151870
    4  3/3/17 23:45   10 15TH ST, San Diego, CA  911P       2.0       32.705722  -117.15035


The columns are the date and time of the call, the full address, the latitude and longitude, the call type, and the priority (the level of seriousness of the crime).

I want to predict when future crimes will happen, or where they will happen again. How can I do that, and should I use regression or classification?

I have already predicted the priority, but that doesn't really give me anything useful. I want to predict the time, the location, or both.

This is the code I have for my priority prediction:



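    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # X holds the feature columns and y the priority labels (defined elsewhere)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Fit a 100-tree random forest on the training split
    my_RandomForest = RandomForestClassifier(n_estimators=100, random_state=0)
    my_RandomForest.fit(X_train, y_train)

    # Predict priorities for the held-out split and report the accuracy
    y_predict_fr = my_RandomForest.predict(X_test)
    print(y_predict_fr)
    accuracy_fr = accuracy_score(y_test, y_predict_fr)
    print(accuracy_fr)

Output:

    [4. 3. 2. ... 3. 1. 2.]
    0.95100761598545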






      machine-learning scikit-learn predictive-modeling pandas














edited Apr 3 at 9:06 by Harman
asked Apr 3 at 2:28 by David Arriaga




















          1 Answer

























































Recall what regression and classification are each for. Basically, if you want to predict a discrete variable you should use classification; if you want to predict a continuous variable you should use regression. You can also use regression for a discrete variable if you quantize the output afterwards with a thresholding method or something similar.
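The last point, regression followed by quantization, can be sketched roughly as below. This is only an illustration on synthetic data: the single feature, the ordinal target with levels 1 to 4, the choice of `LinearRegression`, and the `quantize` helper are all assumptions made for the example, not something taken from the question's data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, illustrative data: one feature and an ordinal target with levels 1..4.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 4.0, size=(200, 1))
y = np.clip(np.ceil(X[:, 0]), 1, 4)  # level depends on which unit interval X falls in

# Fit an ordinary regressor on the discrete target.
reg = LinearRegression().fit(X, y)


def quantize(pred, levels=(1, 2, 3, 4)):
    """Hypothetical helper: snap continuous predictions to the nearest allowed level."""
    levels = np.asarray(levels, dtype=float)
    idx = np.abs(pred[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]


y_cont = reg.predict(X)      # continuous regression output
y_disc = quantize(y_cont)    # thresholded back to discrete levels
accuracy = float((y_disc == y).mean())
print(accuracy)
```

Snapping to the nearest level is the simplest thresholding rule; the thresholds could also be tuned on a validation set if the levels are not evenly spaced.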



• Time and location are continuous variables, so you would predict them with regression.

• Call_type is a discrete variable, so you would predict it with classification.

• Full address is determined by lat and long, so it is redundant and you should probably remove it from your model.
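A minimal sketch of how these three points might map onto scikit-learn, using only the five sample rows from the question. The derived time features (`hour`, `dayofweek`, `month`) and the choice of random forests are assumptions for illustration. With five rows the fitted models are meaningless, so this shows the setup rather than a working predictor.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# The five sample rows from the question; the real data set has over 250k rows.
df = pd.DataFrame({
    "date_time": ["6/14/17 21:54", "3/29/17 22:24", "6/3/17 18:04",
                  "3/17/17 10:57", "3/3/17 23:45"],
    "FullAddress": ["10 14TH ST, San Diego, CA"] * 4 + ["10 15TH ST, San Diego, CA"],
    "call_type": ["1151", "1016", "1016", "1151", "911P"],
    "priority": [2.0, 2.0, 2.0, 2.0, 2.0],
    "lat": [32.705449, 32.705449, 32.705449, 32.705449, 32.705722],
    "long": [-117.151870, -117.151870, -117.151870, -117.151870, -117.15035],
})

# Parse the timestamp and derive numeric time features (an illustrative choice).
ts = pd.to_datetime(df["date_time"], format="%m/%d/%y %H:%M")
df["hour"] = ts.dt.hour
df["dayofweek"] = ts.dt.dayofweek
df["month"] = ts.dt.month

# FullAddress is redundant given lat/long, so it is dropped along with the raw string date.
df = df.drop(columns=["FullAddress", "date_time"])

time_features = ["hour", "dayofweek", "month"]

# Continuous targets (lat, long) -> regression; two outputs are fitted at once.
loc_model = RandomForestRegressor(n_estimators=100, random_state=0)
loc_model.fit(df[time_features], df[["lat", "long"]])

# Discrete target (call_type) -> classification.
type_model = RandomForestClassifier(n_estimators=100, random_state=0)
type_model.fit(df[time_features + ["lat", "long"]], df["call_type"])

pred_loc = loc_model.predict(df[time_features])
pred_type = type_model.predict(df[time_features + ["lat", "long"]])
print(pred_loc.shape, pred_type[:3])
```

On the full data set the models would be evaluated on a held-out split, as in the question's priority code.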

















answered Apr 3 at 13:32 by Pedro Henrique Monforte











• It might make sense for OP to bin location into regions (streets, districts, etc.), in which case the transformed feature would fall under classification. – ukemi, Apr 3 at 14:08

• The problem with that is that it will generate too many classes, and lat and long can be mapped back into regions anyway. – Pedro Henrique Monforte, Apr 3 at 14:13

• But yes, if you want to go that way, it is totally valid. – Pedro Henrique Monforte, Apr 3 at 14:13
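The binning discussed in these comments could look roughly like the sketch below, which snaps coordinates to a regular grid and maps each cell back to its centre. The cell size `CELL_DEG`, the grid scheme itself, and the third coordinate pair are assumptions for illustration; streets or districts would need a lookup table or shapefile instead.

```python
import numpy as np
import pandas as pd

# Two coordinates from the question's sample plus one made-up point further away.
df = pd.DataFrame({
    "lat": [32.705449, 32.705722, 32.716000],
    "long": [-117.151870, -117.150350, -117.161000],
})

CELL_DEG = 0.005  # hypothetical cell size in degrees (roughly 550 m north to south)

# Integer row/column index of the grid cell each point falls in.
df["cell_row"] = np.floor(df["lat"] / CELL_DEG).astype(int)
df["cell_col"] = np.floor(df["long"] / CELL_DEG).astype(int)

# One categorical label per cell; this is the class a classifier would predict.
df["region"] = df["cell_row"].astype(str) + "_" + df["cell_col"].astype(str)

# Mapping back: the centre of each cell gives an approximate lat/long.
df["cell_lat"] = (df["cell_row"] + 0.5) * CELL_DEG
df["cell_long"] = (df["cell_col"] + 0.5) * CELL_DEG

print(df[["region", "cell_lat", "cell_long"]])
print(df["region"].nunique(), "distinct regions")
```

The cell size controls the trade-off raised above: smaller cells give finer locations but many more classes, each with fewer training rows.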















