Amazon cloud instance type most suited for R data mining

I'm new to machine learning. I have always used my laptop for regular statistical analysis with no performance problems, but lately I started programming with caret and I find myself spending hours optimising models and resampling datasets. I looked at EC2 instances but I can't work out the difference between the classes. I know that, in general, instances with more CPUs and RAM perform better, but which instance type is best for R programming (for example, a p instance versus a c instance)? I'll choose the right amount of memory for my applications afterwards; for now I'm wondering whether one family is better suited than the others.

asked Apr 3 at 11:12

GGA

1206

machine-learning r cloud-computing

1 Answer
It very much depends on the calculations you are doing as well as the tools you are using to implement them and the size of data you are working with.
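As a rough way to judge the "size of data" part, you can measure how much memory a dataset occupies once it is loaded in R. The sketch below uses only base R; the simulated data frame and the `copy_factor` multiplier are assumptions for illustration, not figures from any benchmark.

```r
# Hypothetical stand-in for a real training set: 100,000 rows, 20 numeric columns.
set.seed(1)
n_rows <- 1e5
n_cols <- 20
train <- as.data.frame(matrix(rnorm(n_rows * n_cols), nrow = n_rows))

# Size of the object as R holds it in memory.
size_bytes <- as.numeric(object.size(train))
size_mb <- size_bytes / 1024^2

# Assumed safety factor: model fitting and resampling tend to copy the data
# several times, so budget a multiple of the raw size. The value 5 is a guess.
copy_factor <- 5
budget_gb <- size_mb * copy_factor / 1024

cat(sprintf("data: %.1f MB, rough RAM budget: %.2f GB\n", size_mb, budget_gb))
```

Run this on your own data instead of the simulated frame, then compare the budget against the memory listed for the instance sizes you are considering.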

Some small rules of thumb:

- R processes (generally) tend to be RAM bound, so you may want a memory-optimised instance.
- The RStudio IDE has a profiling tool you can use to check how your code executes and where the time is spent.
- It may be easier and cheaper to optimise your code with tools like data.table and Rcpp before scaling up.
  - ML models are unlikely to be affected by this; it helps more with data prep. Check the profiler.
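To make the "measure before you scale" advice concrete, here is a minimal sketch that times two versions of the same data-prep step with base R's `system.time`. The rescaling task and both function names are hypothetical, chosen only to illustrate the approach; the RStudio profiler gives a more detailed, line-level view of the same thing.

```r
# Hypothetical data-prep step: scale each column of a numeric matrix to [0, 1].
set.seed(42)
x <- matrix(runif(2e5), ncol = 20)

# Loop version: fills the result one element at a time.
rescale_loop <- function(m) {
  out <- matrix(NA_real_, nrow = nrow(m), ncol = ncol(m))
  for (j in seq_len(ncol(m))) {
    lo <- min(m[, j])
    hi <- max(m[, j])
    for (i in seq_len(nrow(m))) {
      out[i, j] <- (m[i, j] - lo) / (hi - lo)
    }
  }
  out
}

# Vectorised version: column minima and ranges are applied in bulk with sweep().
rescale_vec <- function(m) {
  lo <- apply(m, 2, min)
  hi <- apply(m, 2, max)
  sweep(sweep(m, 2, lo, "-"), 2, hi - lo, "/")
}

# Time both versions and confirm they produce the same result.
t_loop <- system.time(a <- rescale_loop(x))["elapsed"]
t_vec  <- system.time(b <- rescale_vec(x))["elapsed"]
stopifnot(isTRUE(all.equal(a, b)))

print(c(loop = unname(t_loop), vectorised = unname(t_vec)))
```

If the timings show that data prep is a small share of the total run time, rewriting it will not help much, and a bigger instance for the model fitting itself becomes the more sensible next step.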

If you are REALLY keen on getting the model running fast, look into the GPU instances. However, in the case of R using caret, I don't know whether it will make use of a GPU. Do your research first, as these are among the most expensive flavours of EC2. TPUs are (from my limited research) specifically optimised for TensorFlow ML, and they are a Google Cloud product rather than an EC2 instance type.

A plan B would be to start in AWS SageMaker notebooks. That abstracts away all the faff of managing the EC2 instance, so you can just focus on building the ML.

For extra credit on your EC2, use one of the R community AMIs put together by this lovely person: http://www.louisaslett.com/RStudio_AMI/

answered Apr 10 at 8:42

DaveRGP

1313
