Linear regression load model doesn't predict as expected



I have trained a linear regression model with scikit-learn to predict a 5-star rating, and it performs well enough. I used Doc2Vec to create the document vectors and saved that model, then saved the linear regression model to a separate file. Now I am trying to load both the Doc2Vec model and the linear regression model and predict the rating of a new review.

Something is very strange about the prediction: whatever the input, it always predicts a value of around 2.1-3.0.

My first suspicion was that the model just predicts around half of the 5-star scale (about 2.5), but that does not seem to be the case: when I printed the predicted and actual values for the test data during training, both ranged normally from 1 to 5. So my guess is that something is wrong in the loading part of the code (or even in the reshape of the new vector). This is my load code:



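    from gensim.models.doc2vec import Doc2Vec, TaggedDocument
    from bs4 import BeautifulSoup
    from joblib import dump, load
    import pickle
    import re

    # Load the trained Doc2Vec model
    model = Doc2Vec.load('../vectors/750000/doc2vec_model')

    def cleanText(text):
        text = BeautifulSoup(text, "lxml").text    # strip HTML markup
        text = re.sub(r'\|\|\|', r' ', text)       # replace "|||" separators with a space
        text = re.sub(r'http\S+', r'<URL>', text)  # replace URLs with a placeholder
        text = re.sub(r'[^\w\s]', '', text)        # drop punctuation
        text = text.lower()
        text = text.replace('x', '')               # removes every letter "x"
        return text

    # Clean and tokenize the new review, then infer its document vector
    review = cleanText("Horrible movie! I don't recommend it to anyone!").split()
    vector = model.infer_vector(review)

    # Load the saved linear regression model (read with pickle, despite the .joblib extension)
    pkl_filename = "../vectors/750000/linear_regression_model.joblib"
    with open(pkl_filename, 'rb') as file:
        linreg = pickle.load(file)

    # predict() expects a 2-D array: one row per sample
    review_vector = vector.reshape(1, -1)
    predict_star = linreg.predict(review_vector)
    print(predict_star)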
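One quick sanity check, offered as a sketch rather than a diagnosis: the tokens passed to `infer_vector` should look like the tokens the Doc2Vec model saw during training. The snippet below reproduces only the regex steps of `cleanText`, assuming the patterns are `\|\|\|`, `http\S+` and `[^\w\s]`. The BeautifulSoup call is left out so that it runs with the standard library alone, and the name `clean_text_sketch` is hypothetical.

```python
import re


def clean_text_sketch(text):
    """Regex-only approximation of cleanText (no HTML stripping)."""
    text = re.sub(r'\|\|\|', ' ', text)       # replace literal "|||" separators with a space
    text = re.sub(r'http\S+', '<URL>', text)  # replace URLs with a placeholder
    text = re.sub(r'[^\w\s]', '', text)       # drop punctuation (this also strips the < > of the placeholder)
    text = text.lower()
    text = text.replace('x', '')              # removes every letter "x", as in the question's code
    return text


# Tokens that would be handed to infer_vector for the example review
tokens = clean_text_sketch("Horrible movie! I don't recommend it to anyone!").split()
print(tokens)

# The replace('x', '') step also alters ordinary words
print(clean_text_sketch("An excellent film").split())
```

If the reviews were tokenized differently at training time (for example, without removing `x`), the inferred vector may not be comparable with the vectors the regression was fitted on, so it is worth printing both sets of tokens side by side.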






      machine-learning python scikit-learn linear-regression word-embeddings






asked yesterday by Marilou (new contributor), edited yesterday
