Linear regression load model doesn't predict as expected





















I have trained a linear regression model with sklearn to predict a 5-star rating, and its accuracy is good enough. I used Doc2Vec to create the document vectors and saved that model. I then saved the linear regression model to a separate file. Now I am trying to load both the Doc2Vec model and the linear regression model and predict the rating of a new review.



Something is very strange about the prediction: whatever the input, it always predicts a value of around 2.1-3.0.



My first thought was that the model simply predicts something close to the middle of the 5-star scale (about 2.5), but that was not the case during training: when I printed the predicted and actual values for the test data, both ranged normally from 1 to 5. So I suspect that something is wrong in the loading part of the code (or even in the reshape of the new vector). This is my load code:



    from bs4 import BeautifulSoup
    from gensim.models.doc2vec import Doc2Vec
    from joblib import load
    import re

    # Load the Doc2Vec model that produced the training vectors.
    model = Doc2Vec.load('../vectors/750000/doc2vec_model')

    def cleanText(text):
        text = BeautifulSoup(text, "lxml").text    # strip HTML markup
        text = re.sub(r'\|\|\|', r' ', text)       # replace literal "|||" with a space
        text = re.sub(r'http\S+', r'<URL>', text)  # mask URLs
        text = re.sub(r'[^\w\s]', '', text)        # drop punctuation
        text = text.lower()
        text = text.replace('x', '')               # note: removes every letter 'x'
        return text

    review = cleanText("Horrible movie! I don't recommend it to anyone!").split()
    vector = model.infer_vector(review)

    # The file has a .joblib extension, so this assumes it was written with
    # joblib.dump; if it was written with pickle.dump, use pickle.load instead.
    pkl_filename = "../vectors/750000/linear_regression_model.joblib"
    linreg = load(pkl_filename)

    # predict() expects a 2-D array of shape (n_samples, n_features).
    review_vector = vector.reshape(1, -1)
    predict_star = linreg.predict(review_vector)
    print(predict_star)
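One way to rule out the loading step and the reshape might be a small round-trip check like the sketch below. It is only an illustration under stated assumptions: the data is synthetic (random vectors standing in for Doc2Vec output, with made-up ratings), the vector size of 20 is arbitrary, and the model is assumed to be saved with `joblib.dump` and read back with `joblib.load`. If the restored model gives the same prediction as the in-memory one, the save/load and reshape steps are probably not the cause, and the input vector itself would be the next thing to inspect.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for Doc2Vec vectors and star ratings (not the real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = np.clip(3.0 + X[:, 0] + 0.5 * X[:, 1], 1.0, 5.0)

linreg = LinearRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "linear_regression_model.joblib")
    dump(linreg, path)     # save with joblib ...
    restored = load(path)  # ... and load with joblib as well

# A single vector has shape (20,); predict() expects (n_samples, n_features).
single_vector = X[0]
review_vector = single_vector.reshape(1, -1)

before = linreg.predict(review_vector)
after = restored.predict(review_vector)

# True when the model behaves identically before and after the round trip.
round_trip_ok = np.allclose(before, after)
print(review_vector.shape, round_trip_ok)
```

The same comparison could be run on a few rows of the real test set: predict with the model right after training, save it, load it in a fresh script, and compare the numbers.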






Tags: machine-learning, python, scikit-learn, linear-regression, word-embeddings






Asked yesterday by Marilou on Data Science Stack Exchange; edited yesterday.



