Histogram is extremely skewed to the left


Hello everyone,

I am not very experienced with visualizations in Python. I am working on a movie dataset and want to understand how the values of the variable 'vote_count' are distributed across the movies in the dataset. I am using very simple code:

    import matplotlib.pyplot as plt

    plt.hist(df_movies.vote_count, bins=7)
    plt.show()

The output looks like this:

[Image: histogram of vote_count with 7 bins]

As is easy to see, almost all of the values are concentrated at the far left of the plot.

I realize that I should probably focus on plt.xticks(), but I tried playing around with it and the histogram still looked awful.

Could you please give me some advice on how to make the plot more understandable and more useful for getting insights?

Thanks a lot in advance!

Tags: python, visualization, matplotlib

asked Mar 19 at 10:46 by user641597 (edited Mar 19 at 21:45)

          1 Answer
I think there are a few "easy wins" here.

1. You might add more bins. You are already using the bins setting; just set it to something high like 100 or even 1000 to get a first feeling for the data.

2. You can define the range of your bins. For example, you could pass a list of two entries, range=[0, 5000], as an additional parameter.

3. You can consider not showing all of the data. Maybe movies with only a few ratings, or movies with a lot of them, are not what you want to focus on. You can use the range argument from point 2 to limit the plot to the region that interests you.

4. You can try a cumulative, normalized plot with cumulative=True, density=True (density replaces the older normed argument, which has been removed from recent Matplotlib releases). This can be very useful for getting a better estimate for statements like "80% of movies have fewer than 100 ratings". For cumulative plots I recommend a very large number of bins.

I hope this helps; let me know if something is unclear.
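As a rough illustration of the four suggestions above, here is a minimal sketch. It is not the answerer's code. The real df_movies.vote_count values are not shown in the question, so a synthetic, long-tailed sample (an assumption) stands in for them; the 0 to 5000 range is simply the example from point 2, and density=True is used in place of normed.

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumption: synthetic long-tailed data standing in for df_movies.vote_count.
rng = np.random.default_rng(0)
vote_count = rng.lognormal(mean=4.0, sigma=1.5, size=5000).astype(int)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Point 1: use many more bins than the original 7.
counts_all, edges_all, _ = axes[0].hist(vote_count, bins=100)
axes[0].set_title("bins=100")

# Points 2 and 3: restrict the bins to the region of interest.
counts_zoom, edges_zoom, _ = axes[1].hist(vote_count, bins=100, range=(0, 5000))
axes[1].set_title("bins=100, range=(0, 5000)")

# Point 4: cumulative, normalized histogram with a large number of bins.
cum, edges_cum, _ = axes[2].hist(
    vote_count, bins=1000, cumulative=True, density=True
)
axes[2].set_title("cumulative=True, density=True")

for ax in axes:
    ax.set_xlabel("vote_count")
fig.tight_layout()

# The kind of statement the cumulative plot helps with, computed directly.
share_below_100 = float(np.mean(vote_count < 100))
print(f"{share_below_100:.0%} of the sample has fewer than 100 votes")

# plt.show()  # uncomment in an interactive session to display the figure
```

With real data, replace the synthetic vote_count array with df_movies.vote_count and adjust the range to whatever region is of interest.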












answered Mar 19 at 12:44 by El Burro
