How should we manage a large database?
We currently manage a large volume of economic data in Excel in my organisation. All of the data is downloaded from different online databases into Excel spreadsheets (one for each data frequency: annual, monthly, and quarterly), and then one main spreadsheet organises everything and creates the tables that we need regularly. By "organise", I mean that many of the things we need are simply identities ($Z=X+Y$, where we would have downloaded data only on $X$ and $Y$).
My view is that this could be done much more efficiently in R, where we would automate the updating of the data and then output the tables that we need. But I am not trained at all in data management.
Would you recommend a better way of doing this, or are there pitfalls to using R that I am not considering?
r databases excel
edited Apr 11 at 13:09 by Stephen Rauch♦
asked Apr 11 at 7:59 by StephenB
You should consider using Apache Spark or Hadoop.
– pythinker, Apr 11 at 13:40
2 Answers
As mentioned in another answer, you should consider using a database. In the long run it will make your life easier.
However, R can help you automate things. You can store the Excel files in specific folders and/or give them specific names (or name patterns, e.g. file1_day1, file1_day2, etc.) and then write R scripts that process them and produce the reports you want.
Since you are using Excel files (nothing against Excel, it is a great program), I am inclined to think that your data will fit in memory on a decent computer running R. In any case, if you end up using R, you should check out the data.table package.
Furthermore, R can help you create more complicated reports. Plus, its learning curve is not too steep.
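To make the folder-and-naming-pattern idea concrete, here is a minimal sketch. It is written in Python rather than R purely for illustration, and it is not code from the original answer. It assumes each spreadsheet has been exported to CSV with hypothetical columns `date`, `series`, and `value`; the function names and the `file1_day*.csv` pattern are likewise assumptions.

```python
import csv
from collections import defaultdict
from pathlib import Path


def load_observations(folder, pattern="file1_day*.csv"):
    """Read every file matching the pattern into {(date, series): value}.

    Files are read in lexicographic name order, so a value in a later
    file overwrites the same (date, series) key from an earlier one.
    """
    observations = {}
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                key = (row["date"], row["series"])
                observations[key] = float(row["value"])
    return observations


def add_identity(observations, target, left, right):
    """Derive target = left + right for every date where both inputs exist."""
    by_date = defaultdict(dict)
    for (date, series), value in observations.items():
        by_date[date][series] = value
    for date, values in by_date.items():
        if left in values and right in values:
            observations[(date, target)] = values[left] + values[right]
    return observations
```

Calling `add_identity(obs, "Z", "X", "Y")` on the loaded data fills in $Z=X+Y$ for each date where both $X$ and $Y$ are present and skips dates where either is missing, which is roughly what the main spreadsheet does by formula today.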
answered Apr 17 at 12:44 by Dimitrios Panagopoulos
Actually, any programming language could do this.
– Juan Esteban de la Calle, Apr 17 at 12:49
True, but R was the language named in the question, and R is quite popular, well supported, and easy to learn (at least in my opinion). Of course, other options exist.
– Dimitrios Panagopoulos, Apr 18 at 18:04
If the data is not that big, use PostgreSQL, MySQL, or SQL Server. Don't use R for this; R is suited to complicated computations and visualizations.
One way or another, you and your colleagues will need to search and update your data, and databases are good at that. See this post for more details:
Do modern R and/or Python libraries make SQL obsolete?
Hire someone who can set this up.
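A rough sketch of what the database route could look like, using Python's built-in `sqlite3` module as a stand-in for PostgreSQL/MySQL/SQL Server. The table layout, the column names, the series names `X` and `Y`, and the helper functions are all assumptions made for illustration, not something prescribed above. The point is only that an identity such as $Z=X+Y$ can live in a view instead of a spreadsheet formula.

```python
import sqlite3


def build_database(path=":memory:"):
    """Create the (hypothetical) schema; ':memory:' keeps the demo self-contained."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS observations (
            series    TEXT NOT NULL,
            frequency TEXT NOT NULL,  -- e.g. 'A', 'Q' or 'M'
            period    TEXT NOT NULL,  -- e.g. '2019' or '2019-Q1'
            value     REAL NOT NULL,
            PRIMARY KEY (series, frequency, period)
        )
        """
    )
    # The identity Z = X + Y is defined once, as a view over the raw data.
    conn.execute(
        """
        CREATE VIEW IF NOT EXISTS z_series AS
        SELECT x.frequency, x.period, x.value + y.value AS value
        FROM observations AS x
        JOIN observations AS y
          ON y.frequency = x.frequency AND y.period = x.period
        WHERE x.series = 'X' AND y.series = 'Y'
        """
    )
    return conn


def load_rows(conn, rows):
    """Insert or update (series, frequency, period, value) tuples."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO observations VALUES (?, ?, ?, ?)", rows
        )


def query_z(conn):
    """Return the derived Z series as (frequency, period, value) tuples."""
    return conn.execute(
        "SELECT frequency, period, value FROM z_series ORDER BY frequency, period"
    ).fetchall()
```

Because the primary key is (series, frequency, period), re-running `load_rows` with a fresh download updates existing observations rather than duplicating them, and `query_z` always reflects the latest values of `X` and `Y`.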
edited Apr 11 at 17:36
answered Apr 11 at 17:12 by bonez001