How should we manage a large database?


We currently manage a large volume of economic data in Excel at my organisation. All of the data is downloaded from different online databases into Excel spreadsheets, one for each data frequency (annual, quarterly, monthly). One main spreadsheet then organises everything and creates the tables that we need regularly. By "organise", I mean that many of the series we need are simple identities: for example, $Z = X + Y$, where we would have downloaded data only on $X$ and $Y$.

My view is that this could be done much more efficiently in R, where we would automate the updating of the data and then generate the tables that we need. However, I am not trained in data management at all.

Would you recommend a better way of doing this, or are there pitfalls to using R that I am not considering?







r databases excel






edited Apr 11 at 13:09
Stephen Rauch

asked Apr 11 at 7:59
StephenB











  • You should consider using Apache Spark or Hadoop. – pythinker, Apr 11 at 13:40
















2 Answers
As mentioned in another answer, you should consider using a database. In the long run, it will make your life easier.

That said, R can help you automate things. You can store the Excel files in specific folders and/or under specific names (or name patterns, e.g. file1_day1, file1_day2, etc.) and then write R scripts that process them and produce the reports you want.

Since you are using Excel files (nothing against Excel, it is a great program), I am inclined to think that your data will fit in memory on a decent computer running R. In any case, if you end up using R, you should check out the data.table package.

Furthermore, R can help you create more complicated reports. Plus, its learning curve is not too steep.
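The folder-and-name-pattern workflow described above can be sketched as follows. This is only an illustration under stated assumptions: it is written in Python rather than R so that it runs with the standard library alone, CSV files stand in for the Excel workbooks, and the file names (`annual_day1.csv`), column names (`period`, `series`, `value`) and helper functions (`load_series`, `add_identity`) are all hypothetical. It also derives the $Z = X + Y$ identity mentioned in the question.

```python
import csv
import tempfile
from pathlib import Path


def load_series(folder, pattern):
    """Read every file matching `pattern` into {period: {series: value}}."""
    data = {}
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                period = data.setdefault(row["period"], {})
                period[row["series"]] = float(row["value"])
    return data


def add_identity(data, target, left, right):
    """Derive target = left + right for every period that has both inputs."""
    for values in data.values():
        if left in values and right in values:
            values[target] = values[left] + values[right]
    return data


# Demo: two tiny hypothetical downloads written to a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "annual_day1.csv").write_text(
        "period,series,value\n2017,X,1.5\n2017,Y,2.0\n"
    )
    Path(folder, "annual_day2.csv").write_text(
        "period,series,value\n2018,X,3.0\n2018,Y,4.0\n"
    )
    table = add_identity(load_series(folder, "annual_*.csv"), "Z", "X", "Y")

# Print one line per period with the downloaded and derived series.
for period in sorted(table):
    print(period, table[period])
```

The same structure carries over to an R script: list the files that match the pattern, read each one, stack them, and add the derived columns.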







answered Apr 17 at 12:44
Dimitrios Panagopoulos











  • Actually, any programming language can do this. – Juan Esteban de la Calle, Apr 17 at 12:49
  • True, but R was named in the question. Also, R is quite popular, has plenty of support, and is easy to learn (at least in my opinion). Of course, other options exist. – Dimitrios Panagopoulos, Apr 18 at 18:04

















If the data is not that big, use PostgreSQL, MySQL, or SQL Server. Don't use R for this; R is better suited to complicated computations and visualizations.

One way or another, you and your colleagues will need to search and update your data, and databases are good at that. See this post for more details:
Do modern R and/or Python libraries make SQL obsolete?

Hire someone who can do this.
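To make the database suggestion concrete, here is a minimal sketch of how the downloaded series could be stored in one table and how an identity such as $Z = X + Y$ becomes a query. Everything specific in it is an assumption for illustration: an in-memory SQLite database (via Python's built-in `sqlite3` module) stands in for PostgreSQL/MySQL/SQL Server, and the table name `observations`, its columns, and the sample values are hypothetical. `INSERT OR REPLACE` is SQLite syntax; other database systems spell an upsert differently.

```python
import sqlite3

# An in-memory SQLite database stands in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE observations (
        series    TEXT NOT NULL,   -- e.g. 'X' or 'Y'
        frequency TEXT NOT NULL,   -- 'annual', 'quarterly' or 'monthly'
        period    TEXT NOT NULL,   -- e.g. '2018' or '2018-Q1'
        value     REAL NOT NULL,
        PRIMARY KEY (series, frequency, period)
    )
    """
)

# Hypothetical downloaded figures in long format: one row per observation.
rows = [
    ("X", "annual", "2017", 1.5),
    ("Y", "annual", "2017", 2.0),
    ("X", "annual", "2018", 3.0),
    ("Y", "annual", "2018", 4.0),
]
# INSERT OR REPLACE lets a fresh download overwrite revised figures.
conn.executemany("INSERT OR REPLACE INTO observations VALUES (?, ?, ?, ?)", rows)

# The identity Z = X + Y as a self-join on frequency and period.
query = """
    SELECT x.period, x.value + y.value AS z
    FROM observations AS x
    JOIN observations AS y
      ON y.frequency = x.frequency AND y.period = x.period
    WHERE x.series = 'X' AND y.series = 'Y' AND x.frequency = 'annual'
    ORDER BY x.period
"""
z_table = conn.execute(query).fetchall()
print(z_table)
conn.close()
```

Keeping all frequencies in one long table, rather than one spreadsheet per frequency, means each derived table is just a query, and the query itself does not depend on which client language runs it.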















edited Apr 11 at 17:36

answered Apr 11 at 17:12
bonez001


























