How to analyse data after applying pandas' groupby function?



I have a data set of Olympic Games medal winners, and I am trying to find the country with the most medals. How do I work with the Series I get back after applying groupby?



Here is my data frame.



   ID       Name Sex   Age       City       Sport   Medal
0   1  A Dijiang   M  24.0  Barcelona  Basketball    Gold
1   2   A Lamusi   M  23.0     London        Judo  Silver
...


I applied the following to my data frame, called qq:



zz = qq[qq.Medal == 'Gold'].groupby(['NOC', 'Medal'])
zz.Medal.value_counts()

NOC  Medal  Medal
ALG  Gold   Gold      5
ANZ  Gold   Gold     20
ARG  Gold   Gold     91
ARM  Gold   Gold      2


After applying groupby, how can I analyse the resulting zz series?



For example, how can I return the country with the most medals?
If I group without the 'Gold' constraint, how can I count the total number of medals for each country?










migrated from datascience.stackexchange.com Mar 31 at 16:35


This question came from our site for Data science professionals, Machine Learning specialists, and those interested in learning more about the field.


















  • Does every record have a medal? If so, remove the qq.Medal == 'Gold' filter and drop 'Medal' from the groupby keys.

    – Esmailian
    Mar 31 at 9:01











  • No. Not every country won a medal.

    – a_a_a
    Mar 31 at 20:24











  • What are the possible values in the Medal column?

    – Esmailian
    Mar 31 at 20:33











  • Gold, Silver, Bronze, NaN

    – a_a_a
    Apr 1 at 5:05
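
Given that the Medal column holds Gold, Silver, Bronze or NaN, one way to follow the suggestion in the comments is to group by NOC alone and count the non-missing medals. This is only a minimal sketch: the small qq frame below is a hypothetical stand-in for the real data, and it assumes the missing medals are real NaN values rather than strings.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's qq frame; only NOC and Medal matter here.
qq = pd.DataFrame({
    "NOC": ["USA", "USA", "USA", "UK", "UK", "FRA"],
    "Medal": ["Gold", "Bronze", np.nan, "Silver", np.nan, np.nan],
})

# count() skips NaN, so this is the number of medals per country.
medals_per_country = qq.groupby("NOC")["Medal"].count()

# The result is an ordinary Series indexed by NOC, so the usual
# Series methods apply to it.
top_country = medals_per_country.idxmax()  # index label of the largest count
top_count = medals_per_country.max()       # the largest count itself

print(medals_per_country.sort_values(ascending=False))
print(top_country, top_count)
```

Because the grouped result is a plain Series, sort_values, idxmax and max can be used on it directly. Countries with no medals stay in the result with a count of 0.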

















python dataset pandas
















asked Mar 31 at 2:59 by a_a_a




1 Answer
You need to filter out the NaN medals first, and then aggregate. Here is an example:



import pandas as pd

# 'NaN' is a placeholder string here, not a real missing value
df = pd.DataFrame([['USA', 'Gold'],
                   ['USA', 'Bronze'],
                   ['USA', 'NaN'],
                   ['UK', 'Silver'],
                   ['UK', 'NaN']],
                  columns=['NOC', 'Medal'])

# Keep only the rows that carry a medal
valid_medals = df[df['Medal'] != 'NaN']
# Count medals per country, largest first; the parentheses let the chain span lines
medal_count = (valid_medals.groupby(['NOC'], as_index=False)
               .count()
               .sort_values(by=['Medal'], ascending=False))
print(medal_count)
print('Top country:')
print(medal_count.iloc[0])


Output:



   NOC  Medal
1  USA      2
0   UK      1
Top country:
NOC      USA
Medal      2
Name: 1, dtype: object
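
The example above stores the missing medals as the literal string 'NaN'. If the real data holds actual missing values (an assumption about the asker's frame), comparing against 'NaN' will not remove them, and dropna is the safer filter. The sketch below uses the same toy data with real NaN values and also breaks the count down by medal type. The names medal_table and Total are introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Same toy data as above, but with real missing values instead of 'NaN' strings.
df = pd.DataFrame({
    "NOC": ["USA", "USA", "USA", "UK", "UK"],
    "Medal": ["Gold", "Bronze", np.nan, "Silver", np.nan],
})

# Drop the rows whose Medal is missing.
valid_medals = df.dropna(subset=["Medal"])

# One row per country, one column per medal type, 0 where none were won.
medal_table = valid_medals.groupby(["NOC", "Medal"]).size().unstack(fill_value=0)

# Add a per-country total and put the largest total first.
medal_table["Total"] = medal_table.sum(axis=1)
medal_table = medal_table.sort_values("Total", ascending=False)

print(medal_table)
top_country = medal_table.index[0]
print("Top country:", top_country)
```

Grouping on both NOC and Medal and then unstacking keeps the per-medal breakdown next to the total, which covers both parts of the question in one table.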





answered Apr 1 at 13:12 by Esmailian




























