Fastest way to pop N items from a large dict











14















I have a large dict src (up to 1M items) and I would like to take N (typical values would be N=10K-20K) items, store them in a new dict dst and leave only the remaining items in src. It doesn't matter which N items are taken. I'm looking for the fastest way to do it on Python 3.6 or 3.7.



Fastest approach I've found so far:



src = {i: i ** 3 for i in range(1000000)}

# Taking items 1 by 1 (~0.0059s)
dst = {}
while len(dst) < 20000:
    item = src.popitem()
    dst[item[0]] = item[1]


Is there anything better? Even a marginal gain would be good.













































      python python-3.x performance dictionary optimization














      edited 2 days ago by martineau










      asked 2 days ago by Ivailo Karamanolev






















          3 Answers


















          6














          This is a bit faster still:



          from itertools import islice

          def method_4(d):
              result = dict(islice(d.items(), 20000))
              for k in result: del d[k]
              return result
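A minimal sketch of the same idea with the count passed in as a parameter. The name `pop_n` is hypothetical and not part of the answer. One thing to be aware of: this variant takes items from the front of the dict (iteration order), whereas `popitem()` takes them from the end; the question says either is acceptable.

```python
from itertools import islice

def pop_n(d, n):
    """Move up to n items out of d into a new dict and return it."""
    # Copy the first n items in iteration order; dict() consumes the slice in C.
    taken = dict(islice(d.items(), n))
    # Remove the copied keys from the source dict.
    for k in taken:
        del d[k]
    return taken

src = {i: i ** 3 for i in range(10)}
dst = pop_n(src, 3)
print(dst)       # {0: 0, 1: 1, 2: 8}
print(len(src))  # 7
```

If `n` exceeds `len(d)`, `islice` simply stops early, so the function empties the dict instead of raising `KeyError` the way repeated `popitem()` calls would.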


          Compared to other versions, using Netwave's testcase:



          Method 1: 0.004459443036466837 # original
          Method 2: 0.0034434819826856256 # Netwave
          Method 3: 0.002602717955596745 # chepner
          Method 4: 0.001974945073015988 # this answer


          The extra speedup seems to come from avoiding transitions between C and Python functions. The disassembly shows that the dict instantiation happens on the C side, with only 3 function calls made from Python. The loop uses the DELETE_SUBSCR opcode instead of needing a function call:



          >>> dis.dis(method_4)
            2           0 LOAD_GLOBAL              0 (dict)
                        2 LOAD_GLOBAL              1 (islice)
                        4 LOAD_FAST                0 (d)
                        6 LOAD_ATTR                2 (items)
                        8 CALL_FUNCTION            0
                       10 LOAD_CONST               1 (20000)
                       12 CALL_FUNCTION            2
                       14 CALL_FUNCTION            1
                       16 STORE_FAST               1 (result)

            3          18 SETUP_LOOP              18 (to 38)
                       20 LOAD_FAST                1 (result)
                       22 GET_ITER
                  >>   24 FOR_ITER                10 (to 36)
                       26 STORE_FAST               2 (k)
                       28 LOAD_FAST                0 (d)
                       30 LOAD_FAST                2 (k)
                       32 DELETE_SUBSCR
                       34 JUMP_ABSOLUTE           24
                  >>   36 POP_BLOCK

            4     >>   38 LOAD_FAST                1 (result)
                       40 RETURN_VALUE


          Compared with the iterator in method_2:



          >>> dis.dis(d.popitem() for _ in range(20000))
            1           0 LOAD_FAST                0 (.0)
                  >>    2 FOR_ITER                14 (to 18)
                        4 STORE_FAST               1 (_)
                        6 LOAD_GLOBAL              0 (d)
                        8 LOAD_ATTR                1 (popitem)
                       10 CALL_FUNCTION            0
                       12 YIELD_VALUE
                       14 POP_TOP
                       16 JUMP_ABSOLUTE            2
                  >>   18 LOAD_CONST               0 (None)
                       20 RETURN_VALUE


          which needs a Python to C function call for each item.
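To check the comparison on your own machine, a small harness along these lines may help. It is only a sketch: the `bench` helper and the `n` parameter are assumptions added for illustration, and the dict is rebuilt before every run so that each measurement starts from a full source dict. Absolute numbers will differ from those quoted above.

```python
import timeit
from itertools import islice

def method_2(d, n=20000):
    # One Python-level call to d.popitem() per item.
    return dict(d.popitem() for _ in range(n))

def method_4(d, n=20000):
    # dict() consumes the islice in C; only the deletes loop in Python.
    result = dict(islice(d.items(), n))
    for k in result:
        del d[k]
    return result

def bench(func, size=1000000, rounds=3):
    """Return the best of `rounds` timings, each on a freshly built dict."""
    best = float("inf")
    for _ in range(rounds):
        src = {i: i ** 3 for i in range(size)}  # rebuilt so every run starts full
        best = min(best, timeit.timeit(lambda: func(src), number=1))
    return best

if __name__ == "__main__":
    for func in (method_2, method_4):
        print(func.__name__, bench(func))
```

Taking the minimum over a few rounds, rather than a single run, reduces the influence of one-off noise on timings this short.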































          • I was researching this! what a sync!

            – Netwave
            yesterday











          • @Netwave Do you think this should now be the accepted answer?

            – Ivailo Karamanolev
            yesterday











          • @IvailoKaramanolev, yes, we were searching for the fastest, and indeed this one is.

            – Netwave
            yesterday


















          15














          A simple generator expression inside dict will do:



          dict(src.popitem() for _ in range(20000))


          Here are the timing tests:



          setup = """
          src = {i: i ** 3 for i in range(1000000)}

          def method_1(d):
              dst = {}
              while len(dst) < 20000:
                  item = d.popitem()
                  dst[item[0]] = item[1]
              return dst

          def method_2(d):
              return dict(d.popitem() for _ in range(20000))
          """
          import timeit
          print("Method 1: ", timeit.timeit('method_1(src)', setup=setup, number=1))

          print("Method 2: ", timeit.timeit('method_2(src)', setup=setup, number=1))


          Results:



          Method 1: 0.007701821999944514
          Method 2: 0.004668198998842854

























          • 3





            so much for dict comprehension. Good one using dict!!!

            – Jean-François Fabre
            2 days ago






          • 2





            Thank you! Hopefully next moderator @Jean-FrançoisFabre ;)

            – Netwave
            2 days ago






          • 1





            for my solution, at least, I've upped all 10 times and it's still faster. The key is to make the code as short as possible and rely on native functions.

            – Jean-François Fabre
            2 days ago







          • 4





            You can shave a little more time off by saving the bound method first. f = d.popitem; return dict(f() for _ in range(20000)).

            – chepner
            2 days ago






          • 3





            Using itertools.islice and itertools.repeat is even a little faster still: dict(f() for f in islice(repeat(d.popitem), 20000)).

            – chepner
            2 days ago
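The two variants chepner describes in the comments above can be written out as functions. This is a sketch for illustration: the names `pop_bound` and `pop_repeat` and the `n` parameter are hypothetical. The first saves the attribute lookup of `d.popitem` on every iteration; the second also replaces the `range` loop with `itertools` iterators.

```python
from itertools import islice, repeat

def pop_bound(d, n):
    # Look up d.popitem once instead of on every iteration.
    popitem = d.popitem
    return dict(popitem() for _ in range(n))

def pop_repeat(d, n):
    # repeat() yields the same bound method each time; islice() stops after n.
    return dict(f() for f in islice(repeat(d.popitem), n))

a = {i: i ** 3 for i in range(1000)}
b = dict(a)
print(pop_bound(a, 10) == pop_repeat(b, 10))  # True
print(len(a), len(b))                         # 990 990
```

Both still make one call to `popitem` per item, so they narrow the gap to the `islice(d.items(), n)` approach without removing the per-item call.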


















          2














          I found this approach slightly faster (about 10% less time): a dictionary comprehension that consumes a generator looping over range and unpacks the keys and values:



          dst = {key: value for key, value in (src.popitem() for _ in range(20000))}


          on my machine:



          your code: 0.00899505615234375
          my code: 0.007996797561645508


          So about 12% faster: not bad, but not as good as Netwave's simpler answer, which avoids the unpacking.



          This approach can be useful if you want to transform the keys or values in the process.
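As an illustration of that last point, here is a minimal sketch that transforms keys and values while moving them. The helper name `pop_n_transformed` and its `key_func` / `value_func` parameters are hypothetical, chosen only for this example.

```python
def pop_n_transformed(d, n, key_func, value_func):
    """Pop n items from d, applying a function to each key and each value."""
    return {
        key_func(k): value_func(v)
        for k, v in (d.popitem() for _ in range(n))
    }

src = {i: i ** 3 for i in range(100)}
# Stringify the keys and halve the values while moving 5 items out of src.
dst = pop_n_transformed(src, 5, str, lambda v: v / 2)
print(len(dst), len(src))  # 5 95
```

If no transformation is needed, passing the `popitem()` tuples straight to `dict()` avoids the unpacking step and, per the timings above, is the faster choice.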































































            The method_4 answer above: answered 2 days ago by jpa, edited yesterday.




















            The dict(src.popitem() for _ in range(20000)) answer above: answered 2 days ago by Netwave, edited yesterday.







            • 3





              so much for dict comprehension. Good one using dict!!!

              – Jean-François Fabre
              2 days ago






            • 2





              Thank you! Hopefully next moderator @Jean-FrançoisFabre ;)

              – Netwave
              2 days ago






            • 1





              for my solution, at least, I've upped all 10 times and it's still faster. The key is to make the shorter code possible and rely on native functions.

              – Jean-François Fabre
              2 days ago







            • 4





              You can shave a little more time off by saving the bound method first. f = d.popitem; return dict(f() for _ in range(20000)).

              – chepner
              2 days ago






            • 3





              Using itertools.islice and itertools.repeat is even a little faster still: dict(f() for f in islice(repeat(d.popitem), 20000)).

              – chepner
              2 days ago























            2














            I found this approach slightly faster (about 10% less time): a dictionary comprehension that consumes a generator looping over range, which yields the popped items and unpacks them into keys and values:



            dst = {key: value for key, value in (src.popitem() for _ in range(20000))}


            on my machine:



            your code: 0.00899505615234375
            my code: 0.007996797561645508


            So about 12% faster: not bad, but not as good as Netwave's simpler answer, which avoids the unpacking.



            This approach can be useful if you want to transform the keys or values in the process.
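
            As an illustration of that last point, here is a small sketch that transforms keys and values while popping. The function name pop_n_transformed and the default transforms (str for keys, doubling for values) are purely hypothetical choices for the example.

```python
def pop_n_transformed(src, n, key_func=str, value_func=lambda v: v * 2):
    # Unpacking each popped (key, value) pair lets both parts be
    # transformed before they are stored in the new dict.
    return {
        key_func(key): value_func(value)
        for key, value in (src.popitem() for _ in range(n))
    }


if __name__ == "__main__":
    src = {1: 10, 2: 20, 3: 30}
    print(pop_n_transformed(src, 2))  # {'3': 60, '2': 40}
    print(src)                        # {1: 10}
```

            With dict(...) applied directly to popitem() there is no place to hook in such a transform, which is the trade-off for its speed.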














            answered 2 days ago by Jean-François Fabre (edited 2 days ago)


























