How to deal with insufficient memory when reading a file with pandas in Python
I use pandas.read_csv to read a huge file for machine learning, but I get a memory error.
Someone recommended setting the chunksize argument, but I need to sort the data, access rows randomly, and so on. So I either need to load the whole dataset into memory or find another approach.
One option I can think of is Hadoop. Another is incremental training, but that is much like reading with chunksize in read_csv.
What other software, libraries, or approaches can I use?
Tags: machine-learning, pandas
asked yesterday by code_worker
Comment (Kiritee Gak, yesterday): Check this out
1 Answer
I suggest using Dask. I used it successfully to read large data on a machine with 4 GB of RAM. You can get more details here.
To read a CSV, you can do the following:

    import dask.dataframe as dd

    csv_file = 'data.csv'
    df = dd.read_csv(csv_file)  # lazy: data is read in partitions only when a result is computed
answered yesterday by InAFlash, edited yesterday by Glorfindel