Machine Learning vs Deep Learning Announcing the arrival of Valued Associate #679: Cesar Manara Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern) 2019 Moderator Election Q&A - Questionnaire 2019 Community Moderator Election ResultsHow to do Machine Learning the right way?Future of deep learning (compared to traditional machine learning)Why is video classification still not that accurate?deep learning for non-image non-NLP tasks?What is the difference between data analysis and machine learning?Fast introduction to deep learning in Python, with advanced math and some machine learning backgrounds, but not much Python experienceHow to get a feature from sound/audio data learning using machine learning supervised classification?Is deduction, genetic programming, PCA, or clustering machine learning according to Tom Mitchells definition?What brings the performance difference in Deep Learning with different data augmentation strategies?

Novel: non-telepath helps overthrow rule by telepaths

Is it a good idea to use CNN to classify 1D signal?

Is it fair for a professor to grade us on the possession of past papers?

In predicate logic, does existential quantification (∃) include universal quantification (∀), i.e. can 'some' imply 'all'?

Dating a Former Employee

How to find all the available tools in mac terminal?

Should I use a zero-interest credit card for a large one-time purchase?

Use second argument for optional first argument if not provided in macro

Why didn't Eitri join the fight?

Is it true that "carbohydrates are of no use for the basal metabolic need"?

Tht Aain’t Right... #2

When a candle burns, why does the top of wick glow if bottom of flame is hottest?

Should I discuss the type of campaign with my players?

Sci-Fi book where patients in a coma ward all live in a subconscious world linked together

Fundamental Solution of the Pell Equation

What does an IRS interview request entail when called in to verify expenses for a sole proprietor small business?

Given a circle and line equations what are all possible solutions for k where the line sits tangent to the circle?

An adverb for when you're not exaggerating

How to convince students of the implication truth values?

What's the purpose of writing one's academic biography in the third person?

How much time will it take to get my passport back if I am applying for multiple Schengen visa countries?

Gordon Ramsay Pudding Recipe

What does the "x" in "x86" represent?

51k Euros annually for a family of 4 in Berlin: Is it enough?



Machine Learning vs Deep Learning



Announcing the arrival of Valued Associate #679: Cesar Manara
Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern)
2019 Moderator Election Q&A - Questionnaire
2019 Community Moderator Election ResultsHow to do Machine Learning the right way?Future of deep learning (compared to traditional machine learning)Why is video classification still not that accurate?deep learning for non-image non-NLP tasks?What is the difference between data analysis and machine learning?Fast introduction to deep learning in Python, with advanced math and some machine learning backgrounds, but not much Python experienceHow to get a feature from sound/audio data learning using machine learning supervised classification?Is deduction, genetic programming, PCA, or clustering machine learning according to Tom Mitchells definition?What brings the performance difference in Deep Learning with different data augmentation strategies?










9












$begingroup$


I am a bit confused by the difference between the terms "Machine Learning" and "Deep Learning". I have Googled it and read many articles, but it is still not very clear to me.



A known definition of Machine Learning by Tom Mitchell is:




A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E.




If I take an image classification problem of classifying dogs and cats as my taks T, from this definition I understand that if I would give a ML algorithm a bunch of images of dogs and cats (experience E), the ML algorithm could learn how to distinguish a new image as being either a dog or cat (provided the performance measure P is well defined).



Then comes Deep Learning. I understand that Deep Learning is part of Machine Learning, and that the above definition holds. The performance at task T improves with experience E. All fine till now.



This blog states that there is a difference between Machine Learning and Deep Learning. The difference according to Adil is that in (Traditional) Machine Learning the features have to be hand-crafted, whereas in Deep Learning the features are learned. The following figures clarify his statement.



enter image description here



I am confused by the fact that in (Traditional) Machine Learning the features have to be hand-crafted. From the above definition by Tom Mitchell, I would think that these features would be learned from experience E and performance P. What could otherwise be learned in Machine Learning?



In Deep Learning I understand that from experience you learn the features and how they relate to each other to improve the performance. Could I conclude that in Machine Learning features have to be hand-crafted and what is learned is the combination of features? Or am I missing something else?










share|improve this question











$endgroup$







  • 1




    $begingroup$
    This is covered very well in the Deep Learning book by Goodfellow et al. in the first chapter (Introduction).
    $endgroup$
    – hbaderts
    Jan 20 '17 at 16:01
















9












$begingroup$


I am a bit confused by the difference between the terms "Machine Learning" and "Deep Learning". I have Googled it and read many articles, but it is still not very clear to me.



A known definition of Machine Learning by Tom Mitchell is:




A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E.




If I take an image classification problem of classifying dogs and cats as my taks T, from this definition I understand that if I would give a ML algorithm a bunch of images of dogs and cats (experience E), the ML algorithm could learn how to distinguish a new image as being either a dog or cat (provided the performance measure P is well defined).



Then comes Deep Learning. I understand that Deep Learning is part of Machine Learning, and that the above definition holds. The performance at task T improves with experience E. All fine till now.



This blog states that there is a difference between Machine Learning and Deep Learning. The difference according to Adil is that in (Traditional) Machine Learning the features have to be hand-crafted, whereas in Deep Learning the features are learned. The following figures clarify his statement.



enter image description here



I am confused by the fact that in (Traditional) Machine Learning the features have to be hand-crafted. From the above definition by Tom Mitchell, I would think that these features would be learned from experience E and performance P. What could otherwise be learned in Machine Learning?



In Deep Learning I understand that from experience you learn the features and how they relate to each other to improve the performance. Could I conclude that in Machine Learning features have to be hand-crafted and what is learned is the combination of features? Or am I missing something else?










share|improve this question











$endgroup$







  • 1




    $begingroup$
    This is covered very well in the Deep Learning book by Goodfellow et al. in the first chapter (Introduction).
    $endgroup$
    – hbaderts
    Jan 20 '17 at 16:01














9












9








9


5



$begingroup$


I am a bit confused by the difference between the terms "Machine Learning" and "Deep Learning". I have Googled it and read many articles, but it is still not very clear to me.



A known definition of Machine Learning by Tom Mitchell is:




A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E.




If I take an image classification problem of classifying dogs and cats as my taks T, from this definition I understand that if I would give a ML algorithm a bunch of images of dogs and cats (experience E), the ML algorithm could learn how to distinguish a new image as being either a dog or cat (provided the performance measure P is well defined).



Then comes Deep Learning. I understand that Deep Learning is part of Machine Learning, and that the above definition holds. The performance at task T improves with experience E. All fine till now.



This blog states that there is a difference between Machine Learning and Deep Learning. The difference according to Adil is that in (Traditional) Machine Learning the features have to be hand-crafted, whereas in Deep Learning the features are learned. The following figures clarify his statement.



enter image description here



I am confused by the fact that in (Traditional) Machine Learning the features have to be hand-crafted. From the above definition by Tom Mitchell, I would think that these features would be learned from experience E and performance P. What could otherwise be learned in Machine Learning?



In Deep Learning I understand that from experience you learn the features and how they relate to each other to improve the performance. Could I conclude that in Machine Learning features have to be hand-crafted and what is learned is the combination of features? Or am I missing something else?










share|improve this question











$endgroup$




I am a bit confused by the difference between the terms "Machine Learning" and "Deep Learning". I have Googled it and read many articles, but it is still not very clear to me.



A known definition of Machine Learning by Tom Mitchell is:




A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E.




If I take an image classification problem of classifying dogs and cats as my taks T, from this definition I understand that if I would give a ML algorithm a bunch of images of dogs and cats (experience E), the ML algorithm could learn how to distinguish a new image as being either a dog or cat (provided the performance measure P is well defined).



Then comes Deep Learning. I understand that Deep Learning is part of Machine Learning, and that the above definition holds. The performance at task T improves with experience E. All fine till now.



This blog states that there is a difference between Machine Learning and Deep Learning. The difference according to Adil is that in (Traditional) Machine Learning the features have to be hand-crafted, whereas in Deep Learning the features are learned. The following figures clarify his statement.



enter image description here



I am confused by the fact that in (Traditional) Machine Learning the features have to be hand-crafted. From the above definition by Tom Mitchell, I would think that these features would be learned from experience E and performance P. What could otherwise be learned in Machine Learning?



In Deep Learning I understand that from experience you learn the features and how they relate to each other to improve the performance. Could I conclude that in Machine Learning features have to be hand-crafted and what is learned is the combination of features? Or am I missing something else?







machine-learning deep-learning






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Jan 23 '17 at 14:34







user2835098

















asked Jan 20 '17 at 10:45









user2835098user2835098

5315




5315







  • 1




    $begingroup$
    This is covered very well in the Deep Learning book by Goodfellow et al. in the first chapter (Introduction).
    $endgroup$
    – hbaderts
    Jan 20 '17 at 16:01













  • 1




    $begingroup$
    This is covered very well in the Deep Learning book by Goodfellow et al. in the first chapter (Introduction).
    $endgroup$
    – hbaderts
    Jan 20 '17 at 16:01








1




1




$begingroup$
This is covered very well in the Deep Learning book by Goodfellow et al. in the first chapter (Introduction).
$endgroup$
– hbaderts
Jan 20 '17 at 16:01





$begingroup$
This is covered very well in the Deep Learning book by Goodfellow et al. in the first chapter (Introduction).
$endgroup$
– hbaderts
Jan 20 '17 at 16:01











4 Answers
4






active

oldest

votes


















3












$begingroup$

In addition to what Himanshu Rai said, Deep learning is a subfield which involves the use of neural networks.These neural networks try to learn the underlying distribution by modifying the weights between the layers.
Now, consider the case of image recognition using deep learning:
a neural network model is divided among layers, these layers are connected by links called weights, as the training process begins, these layers adjust the weights such that each layer tries to detect some feature and help the next layer for its processing.The key point to note is we don't explicitly tell the layer to learn to detect edges, or eyes, nose or faces.The model learns to do that itself.Unlike classical machine learning models.






share|improve this answer









$endgroup$




















    6












    $begingroup$

    As a research area, Deep Learning is really just a sub-field of Machine Learning as Machine Learning is a sub-field of Artificial Intelligence.



    1) Unsupervised Feature Learning



    Conceptually, the first main difference between "traditional" (or "shallow") Machine Learning and Deep Learning is Unsupervised Feature Learning.



    As you already know, successfully training a "traditional" Machine Learning model (ex: SVM, XGBoost...) is only possible after suitable pre-processing and judicious feature extraction to select meaningful information from the data.
    That is, good feature vectors contain features distinctive between data points with different labels and consistent among data points with the same label.
    Feature Engineering is thus the process of manual feature selection from experts. This is a very important but tedious taks to perform!



    Unsupervised Feature Learning is a process where the model itself selects features automatically through training.
    The topology of a Neural Network organized in layers connected to each other have the nice property of mapping a low-level representation of the data to a higher-level representation. Through training, the network can thus "decide" what part of the data matters and what part of the data doesn't. This is particularly interesting in Computer Vision or Natural Language Processing where it is quite hard to manually select or engineer robust features.



    Unsupervised Feature Learning, credits: Tony Beltramelli(picture credits: Tony Beltramelli)



    As an example, let's assume we want to classify cat pictures. Using a Deep Neural Net, we can feed in the raw pixel values that will be mapped to a set of weights by the first layer, then these weights will be mapped to other weights by the second layer, until the last layer allows some weights to be mapped to numbers representing your problem. (ex: in this case the probability of the picture containing a cat)



    Even though Deep Neural Networks can perform Unsupervised Feature Learning, it doesn't prevent you from doing Feature Engineering yourself to better represent your problem. Unsupervised Feature Learning, Feature Extraction, and Feature Engineering are not mutually exclusive!



    Sources:



    • http://deeplearning.stanford.edu/tutorial/

    • https://arxiv.org/abs/1404.7828


    • https://arxiv.org/abs/1512.05616 (Chapter 2, Section 2)

    2) Linear Separability



    Deep Neural Networks can solve some non-linearly separable problems by bending the feature space such that features become linearly separable.
    Once again, this is possible thanks to the network topology organized in layers mapping inputs to new representations of the data.



    The hidden layer learns a representation so that the data is linearly separable, credits: Christopher Olah(picture credits: Christopher Olah)



    Sources:
    http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/



    3) Statistical Invariance



    Lastly, Deep Neural Networks are surpassing traditional Machine Learning algorithms in some domains because some architectures are showcasing Statistical Invariance (ex: Spacial Statistical Invariance with Convolutional Neural Networks and Temporal Statistical Invariance with Recurrent Neural Networks)



    Check this Udacity video for more details:
    https://www.youtube.com/watch?v=5PH2Vot-tD4






    share|improve this answer









    $endgroup$




















      2












      $begingroup$

      Inspired by Einstein, "“If you can't explain it to a six year old, you don't understand it yourself.”



      All of the above answers are very well explained but if one is looking for an easy to remember, abstract difference, here is the best one I know:




      The key difference is Machine Learning only digests data, while Deep Learning can generate and enhance data. It is not only predictive but also generative.




      Source. Of course there is much more to it but for beginners it can get way too confusing.






      share|improve this answer









      $endgroup$




















        -4












        $begingroup$

        Okay, think of it like this. In machine learning algirithms, such as linear regression or random forest you give the algorithms a set of features and the target and then it tries to minimize the cost function, so no it doesnt learn any new features, it just learns the weights. Now when you come to deep learning, you have atleast one, (almost always more) hidden layer with a set number of units, these are the features that are being talked about. So a deep learning algorithm doesnt just learn the sets of weights, in that process it also learns the values for hidden units which are complex high level features of the trivial data that you have given. Hence while practicing vanilla machine learning a lot of expertise lies in your ability to engineer features because the algorithm isnt learning any by itself. I hope I answered your question.






        share|improve this answer









        $endgroup$












        • $begingroup$
          Just another question: In for example CNNs, are the features (or filters) not the same thing as the weights?
          $endgroup$
          – user2835098
          Jan 20 '17 at 11:21










        • $begingroup$
          No, they are the weights for the convolution layer, but the product obtained out of the convolution i.e. the feature maps are the features.
          $endgroup$
          – Himanshu Rai
          Jan 20 '17 at 11:39










        • $begingroup$
          I disagree. Hidden variables are also present in random forest and boosting algorithms. And you still engineer features in deep learning. Like cropping area in one of the best image recognition algorithms in 2017
          $endgroup$
          – keiv.fly
          Oct 11 '18 at 22:45











        Your Answer








        StackExchange.ready(function()
        var channelOptions =
        tags: "".split(" "),
        id: "557"
        ;
        initTagRenderer("".split(" "), "".split(" "), channelOptions);

        StackExchange.using("externalEditor", function()
        // Have to fire editor after snippets, if snippets enabled
        if (StackExchange.settings.snippets.snippetsEnabled)
        StackExchange.using("snippets", function()
        createEditor();
        );

        else
        createEditor();

        );

        function createEditor()
        StackExchange.prepareEditor(
        heartbeatType: 'answer',
        autoActivateHeartbeat: false,
        convertImagesToLinks: false,
        noModals: true,
        showLowRepImageUploadWarning: true,
        reputationToPostImages: null,
        bindNavPrevention: true,
        postfix: "",
        imageUploader:
        brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
        contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
        allowUrls: true
        ,
        onDemand: true,
        discardSelector: ".discard-answer"
        ,immediatelyShowMarkdownHelp:true
        );



        );













        draft saved

        draft discarded


















        StackExchange.ready(
        function ()
        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f16422%2fmachine-learning-vs-deep-learning%23new-answer', 'question_page');

        );

        Post as a guest















        Required, but never shown

























        4 Answers
        4






        active

        oldest

        votes








        4 Answers
        4






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes









        3












        $begingroup$

        In addition to what Himanshu Rai said, Deep learning is a subfield which involves the use of neural networks.These neural networks try to learn the underlying distribution by modifying the weights between the layers.
        Now, consider the case of image recognition using deep learning:
        a neural network model is divided among layers, these layers are connected by links called weights, as the training process begins, these layers adjust the weights such that each layer tries to detect some feature and help the next layer for its processing.The key point to note is we don't explicitly tell the layer to learn to detect edges, or eyes, nose or faces.The model learns to do that itself.Unlike classical machine learning models.






        share|improve this answer









        $endgroup$

















          3












          $begingroup$

          In addition to what Himanshu Rai said, Deep learning is a subfield which involves the use of neural networks.These neural networks try to learn the underlying distribution by modifying the weights between the layers.
          Now, consider the case of image recognition using deep learning:
          a neural network model is divided among layers, these layers are connected by links called weights, as the training process begins, these layers adjust the weights such that each layer tries to detect some feature and help the next layer for its processing.The key point to note is we don't explicitly tell the layer to learn to detect edges, or eyes, nose or faces.The model learns to do that itself.Unlike classical machine learning models.






          share|improve this answer









          $endgroup$















            3












            3








            3





            $begingroup$

            In addition to what Himanshu Rai said, Deep learning is a subfield which involves the use of neural networks.These neural networks try to learn the underlying distribution by modifying the weights between the layers.
            Now, consider the case of image recognition using deep learning:
            a neural network model is divided among layers, these layers are connected by links called weights, as the training process begins, these layers adjust the weights such that each layer tries to detect some feature and help the next layer for its processing.The key point to note is we don't explicitly tell the layer to learn to detect edges, or eyes, nose or faces.The model learns to do that itself.Unlike classical machine learning models.






            share|improve this answer









            $endgroup$



            In addition to what Himanshu Rai said, Deep learning is a subfield which involves the use of neural networks.These neural networks try to learn the underlying distribution by modifying the weights between the layers.
            Now, consider the case of image recognition using deep learning:
            a neural network model is divided among layers, these layers are connected by links called weights, as the training process begins, these layers adjust the weights such that each layer tries to detect some feature and help the next layer for its processing.The key point to note is we don't explicitly tell the layer to learn to detect edges, or eyes, nose or faces.The model learns to do that itself.Unlike classical machine learning models.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Jan 20 '17 at 13:59









            AvhirupAvhirup

            882




            882





















                6












                $begingroup$

                As a research area, Deep Learning is really just a sub-field of Machine Learning as Machine Learning is a sub-field of Artificial Intelligence.



                1) Unsupervised Feature Learning



                Conceptually, the first main difference between "traditional" (or "shallow") Machine Learning and Deep Learning is Unsupervised Feature Learning.



                As you already know, successfully training a "traditional" Machine Learning model (ex: SVM, XGBoost...) is only possible after suitable pre-processing and judicious feature extraction to select meaningful information from the data.
                That is, good feature vectors contain features distinctive between data points with different labels and consistent among data points with the same label.
                Feature Engineering is thus the process of manual feature selection from experts. This is a very important but tedious taks to perform!



                Unsupervised Feature Learning is a process where the model itself selects features automatically through training.
                The topology of a Neural Network organized in layers connected to each other have the nice property of mapping a low-level representation of the data to a higher-level representation. Through training, the network can thus "decide" what part of the data matters and what part of the data doesn't. This is particularly interesting in Computer Vision or Natural Language Processing where it is quite hard to manually select or engineer robust features.



                Unsupervised Feature Learning, credits: Tony Beltramelli(picture credits: Tony Beltramelli)



                As an example, let's assume we want to classify cat pictures. Using a Deep Neural Net, we can feed in the raw pixel values that will be mapped to a set of weights by the first layer, then these weights will be mapped to other weights by the second layer, until the last layer allows some weights to be mapped to numbers representing your problem. (ex: in this case the probability of the picture containing a cat)



                Even though Deep Neural Networks can perform Unsupervised Feature Learning, it doesn't prevent you from doing Feature Engineering yourself to better represent your problem. Unsupervised Feature Learning, Feature Extraction, and Feature Engineering are not mutually exclusive!



                Sources:



                • http://deeplearning.stanford.edu/tutorial/

                • https://arxiv.org/abs/1404.7828


                • https://arxiv.org/abs/1512.05616 (Chapter 2, Section 2)

                2) Linear Separability



                Deep Neural Networks can solve some non-linearly separable problems by bending the feature space such that features become linearly separable.
                Once again, this is possible thanks to the network topology organized in layers mapping inputs to new representations of the data.



                The hidden layer learns a representation so that the data is linearly separable, credits: Christopher Olah(picture credits: Christopher Olah)



                Sources:
                http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/



                3) Statistical Invariance



                Lastly, Deep Neural Networks are surpassing traditional Machine Learning algorithms in some domains because some architectures are showcasing Statistical Invariance (ex: Spacial Statistical Invariance with Convolutional Neural Networks and Temporal Statistical Invariance with Recurrent Neural Networks)



                Check this Udacity video for more details:
                https://www.youtube.com/watch?v=5PH2Vot-tD4






                share|improve this answer









                $endgroup$

















                  6












                  $begingroup$

                  As a research area, Deep Learning is really just a sub-field of Machine Learning as Machine Learning is a sub-field of Artificial Intelligence.



                  1) Unsupervised Feature Learning



                  Conceptually, the first main difference between "traditional" (or "shallow") Machine Learning and Deep Learning is Unsupervised Feature Learning.



                  As you already know, successfully training a "traditional" Machine Learning model (ex: SVM, XGBoost...) is only possible after suitable pre-processing and judicious feature extraction to select meaningful information from the data.
                  That is, good feature vectors contain features distinctive between data points with different labels and consistent among data points with the same label.
                  Feature Engineering is thus the process of manual feature selection from experts. This is a very important but tedious taks to perform!



                  Unsupervised Feature Learning is a process where the model itself selects features automatically through training.
                  The topology of a Neural Network organized in layers connected to each other have the nice property of mapping a low-level representation of the data to a higher-level representation. Through training, the network can thus "decide" what part of the data matters and what part of the data doesn't. This is particularly interesting in Computer Vision or Natural Language Processing where it is quite hard to manually select or engineer robust features.



                  Unsupervised Feature Learning, credits: Tony Beltramelli(picture credits: Tony Beltramelli)



                  As an example, let's assume we want to classify cat pictures. Using a Deep Neural Net, we can feed in the raw pixel values that will be mapped to a set of weights by the first layer, then these weights will be mapped to other weights by the second layer, until the last layer allows some weights to be mapped to numbers representing your problem. (ex: in this case the probability of the picture containing a cat)



                  Even though Deep Neural Networks can perform Unsupervised Feature Learning, it doesn't prevent you from doing Feature Engineering yourself to better represent your problem. Unsupervised Feature Learning, Feature Extraction, and Feature Engineering are not mutually exclusive!



                  Sources:



                  • http://deeplearning.stanford.edu/tutorial/

                  • https://arxiv.org/abs/1404.7828


                  • https://arxiv.org/abs/1512.05616 (Chapter 2, Section 2)

                  2) Linear Separability



                  Deep Neural Networks can solve some non-linearly separable problems by bending the feature space such that features become linearly separable.
                  Once again, this is possible thanks to the network topology organized in layers mapping inputs to new representations of the data.



                  The hidden layer learns a representation so that the data is linearly separable, credits: Christopher Olah(picture credits: Christopher Olah)



                  Sources:
                  http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/



                  3) Statistical Invariance



                  Lastly, Deep Neural Networks are surpassing traditional Machine Learning algorithms in some domains because some architectures are showcasing Statistical Invariance (ex: Spacial Statistical Invariance with Convolutional Neural Networks and Temporal Statistical Invariance with Recurrent Neural Networks)



                  Check this Udacity video for more details:
                  https://www.youtube.com/watch?v=5PH2Vot-tD4






                  share|improve this answer









                  $endgroup$















                    6












                    6








                    6





                    $begingroup$

                    As a research area, Deep Learning is really just a sub-field of Machine Learning as Machine Learning is a sub-field of Artificial Intelligence.



                    1) Unsupervised Feature Learning



                    Conceptually, the first main difference between "traditional" (or "shallow") Machine Learning and Deep Learning is Unsupervised Feature Learning.



                    As you already know, successfully training a "traditional" Machine Learning model (ex: SVM, XGBoost...) is only possible after suitable pre-processing and judicious feature extraction to select meaningful information from the data.
                    That is, good feature vectors contain features distinctive between data points with different labels and consistent among data points with the same label.
                    Feature Engineering is thus the process of manual feature selection from experts. This is a very important but tedious taks to perform!



                    Unsupervised Feature Learning is a process where the model itself selects features automatically through training.
                    The topology of a Neural Network organized in layers connected to each other have the nice property of mapping a low-level representation of the data to a higher-level representation. Through training, the network can thus "decide" what part of the data matters and what part of the data doesn't. This is particularly interesting in Computer Vision or Natural Language Processing where it is quite hard to manually select or engineer robust features.



                    Unsupervised Feature Learning, credits: Tony Beltramelli(picture credits: Tony Beltramelli)



                    As an example, let's assume we want to classify cat pictures. Using a Deep Neural Net, we can feed in the raw pixel values that will be mapped to a set of weights by the first layer, then these weights will be mapped to other weights by the second layer, until the last layer allows some weights to be mapped to numbers representing your problem. (ex: in this case the probability of the picture containing a cat)



                    Even though Deep Neural Networks can perform Unsupervised Feature Learning, it doesn't prevent you from doing Feature Engineering yourself to better represent your problem. Unsupervised Feature Learning, Feature Extraction, and Feature Engineering are not mutually exclusive!



                    Sources:



                    • http://deeplearning.stanford.edu/tutorial/

                    • https://arxiv.org/abs/1404.7828


                    • https://arxiv.org/abs/1512.05616 (Chapter 2, Section 2)

                    2) Linear Separability



                    Deep Neural Networks can solve some non-linearly separable problems by bending the feature space such that features become linearly separable.
                    Once again, this is possible thanks to the network topology organized in layers mapping inputs to new representations of the data.



                    The hidden layer learns a representation so that the data is linearly separable, credits: Christopher Olah(picture credits: Christopher Olah)



                    Sources:
                    http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/



                    3) Statistical Invariance



                    Lastly, Deep Neural Networks are surpassing traditional Machine Learning algorithms in some domains because some architectures are showcasing Statistical Invariance (ex: Spacial Statistical Invariance with Convolutional Neural Networks and Temporal Statistical Invariance with Recurrent Neural Networks)



                    Check this Udacity video for more details:
                    https://www.youtube.com/watch?v=5PH2Vot-tD4






                    share|improve this answer









                    $endgroup$



                    As a research area, Deep Learning is really just a sub-field of Machine Learning as Machine Learning is a sub-field of Artificial Intelligence.



                    1) Unsupervised Feature Learning



                    Conceptually, the first main difference between "traditional" (or "shallow") Machine Learning and Deep Learning is Unsupervised Feature Learning.



                    As you already know, successfully training a "traditional" Machine Learning model (ex: SVM, XGBoost...) is only possible after suitable pre-processing and judicious feature extraction to select meaningful information from the data.
                    That is, good feature vectors contain features distinctive between data points with different labels and consistent among data points with the same label.
                    Feature Engineering is thus the process of manual feature selection from experts. This is a very important but tedious taks to perform!



                    Unsupervised Feature Learning is a process where the model itself selects features automatically through training.
                    The topology of a Neural Network organized in layers connected to each other have the nice property of mapping a low-level representation of the data to a higher-level representation. Through training, the network can thus "decide" what part of the data matters and what part of the data doesn't. This is particularly interesting in Computer Vision or Natural Language Processing where it is quite hard to manually select or engineer robust features.



                    Unsupervised Feature Learning, credits: Tony Beltramelli(picture credits: Tony Beltramelli)



                    As an example, let's assume we want to classify cat pictures. Using a Deep Neural Net, we can feed in the raw pixel values that will be mapped to a set of weights by the first layer, then these weights will be mapped to other weights by the second layer, until the last layer allows some weights to be mapped to numbers representing your problem. (ex: in this case the probability of the picture containing a cat)



                    Even though Deep Neural Networks can perform Unsupervised Feature Learning, it doesn't prevent you from doing Feature Engineering yourself to better represent your problem. Unsupervised Feature Learning, Feature Extraction, and Feature Engineering are not mutually exclusive!



                    Sources:



                    • http://deeplearning.stanford.edu/tutorial/

                    • https://arxiv.org/abs/1404.7828


                    • https://arxiv.org/abs/1512.05616 (Chapter 2, Section 2)

                    2) Linear Separability



                    Deep Neural Networks can solve some non-linearly separable problems by bending the feature space such that features become linearly separable.
                    Once again, this is possible thanks to the network topology organized in layers mapping inputs to new representations of the data.



                    The hidden layer learns a representation so that the data is linearly separable, credits: Christopher Olah(picture credits: Christopher Olah)



                    Sources:
                    http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/



                    3) Statistical Invariance



                    Lastly, Deep Neural Networks are surpassing traditional Machine Learning algorithms in some domains because some architectures are showcasing Statistical Invariance (ex: Spacial Statistical Invariance with Convolutional Neural Networks and Temporal Statistical Invariance with Recurrent Neural Networks)



                    Check this Udacity video for more details:
                    https://www.youtube.com/watch?v=5PH2Vot-tD4







                    share|improve this answer












                    share|improve this answer



                    share|improve this answer










                    answered Jan 23 '17 at 13:05









                    tonytony

                    56125




                    56125





















                        2












                        $begingroup$

                        Inspired by Einstein, "“If you can't explain it to a six year old, you don't understand it yourself.”



                        All of the above answers are very well explained but if one is looking for an easy to remember, abstract difference, here is the best one I know:




                        The key difference is Machine Learning only digests data, while Deep Learning can generate and enhance data. It is not only predictive but also generative.




                        Source. Of course there is much more to it but for beginners it can get way too confusing.






                        share|improve this answer









                        $endgroup$

















                          2












                          $begingroup$

                          Inspired by Einstein, "“If you can't explain it to a six year old, you don't understand it yourself.”



                          All of the above answers are very well explained but if one is looking for an easy to remember, abstract difference, here is the best one I know:




                          The key difference is Machine Learning only digests data, while Deep Learning can generate and enhance data. It is not only predictive but also generative.




                          Source. Of course there is much more to it but for beginners it can get way too confusing.






                          share|improve this answer









                          $endgroup$















                            2












                            2








                            2





                            $begingroup$

                            Inspired by Einstein, "“If you can't explain it to a six year old, you don't understand it yourself.”



                            All of the above answers are very well explained but if one is looking for an easy to remember, abstract difference, here is the best one I know:




                            The key difference is Machine Learning only digests data, while Deep Learning can generate and enhance data. It is not only predictive but also generative.




                            Source. Of course there is much more to it but for beginners it can get way too confusing.






                            share|improve this answer









                            $endgroup$



                            Inspired by Einstein, "“If you can't explain it to a six year old, you don't understand it yourself.”



                            All of the above answers are very well explained but if one is looking for an easy to remember, abstract difference, here is the best one I know:




                            The key difference is Machine Learning only digests data, while Deep Learning can generate and enhance data. It is not only predictive but also generative.




                            Source. Of course there is much more to it but for beginners it can get way too confusing.







                            share|improve this answer












                            share|improve this answer



                            share|improve this answer










                            answered Oct 30 '17 at 6:05









                            ZenVentziZenVentzi

                            1291




                            1291





















                                -4












                                $begingroup$

                                Okay, think of it like this. In machine learning algirithms, such as linear regression or random forest you give the algorithms a set of features and the target and then it tries to minimize the cost function, so no it doesnt learn any new features, it just learns the weights. Now when you come to deep learning, you have atleast one, (almost always more) hidden layer with a set number of units, these are the features that are being talked about. So a deep learning algorithm doesnt just learn the sets of weights, in that process it also learns the values for hidden units which are complex high level features of the trivial data that you have given. Hence while practicing vanilla machine learning a lot of expertise lies in your ability to engineer features because the algorithm isnt learning any by itself. I hope I answered your question.






                                share|improve this answer









                                $endgroup$












                                • $begingroup$
                                  Just another question: In for example CNNs, are the features (or filters) not the same thing as the weights?
                                  $endgroup$
                                  – user2835098
                                  Jan 20 '17 at 11:21










                                • $begingroup$
                                  No, they are the weights for the convolution layer, but the product obtained out of the convolution i.e. the feature maps are the features.
                                  $endgroup$
                                  – Himanshu Rai
                                  Jan 20 '17 at 11:39










                                • $begingroup$
                                  I disagree. Hidden variables are also present in random forest and boosting algorithms. And you still engineer features in deep learning. Like cropping area in one of the best image recognition algorithms in 2017
                                  $endgroup$
                                  – keiv.fly
                                  Oct 11 '18 at 22:45















                                -4












                                $begingroup$

                                Okay, think of it like this. In machine learning algirithms, such as linear regression or random forest you give the algorithms a set of features and the target and then it tries to minimize the cost function, so no it doesnt learn any new features, it just learns the weights. Now when you come to deep learning, you have atleast one, (almost always more) hidden layer with a set number of units, these are the features that are being talked about. So a deep learning algorithm doesnt just learn the sets of weights, in that process it also learns the values for hidden units which are complex high level features of the trivial data that you have given. Hence while practicing vanilla machine learning a lot of expertise lies in your ability to engineer features because the algorithm isnt learning any by itself. I hope I answered your question.






                                share|improve this answer









                                $endgroup$












                                • $begingroup$
                                  Just another question: In for example CNNs, are the features (or filters) not the same thing as the weights?
                                  $endgroup$
                                  – user2835098
                                  Jan 20 '17 at 11:21










                                • $begingroup$
                                  No, they are the weights for the convolution layer, but the product obtained out of the convolution i.e. the feature maps are the features.
                                  $endgroup$
                                  – Himanshu Rai
                                  Jan 20 '17 at 11:39










                                • $begingroup$
                                  I disagree. Hidden variables are also present in random forest and boosting algorithms. And you still engineer features in deep learning. Like cropping area in one of the best image recognition algorithms in 2017
                                  $endgroup$
                                  – keiv.fly
                                  Oct 11 '18 at 22:45













                                -4












                                -4








                                -4





                                $begingroup$

                                Okay, think of it like this. In machine learning algirithms, such as linear regression or random forest you give the algorithms a set of features and the target and then it tries to minimize the cost function, so no it doesnt learn any new features, it just learns the weights. Now when you come to deep learning, you have atleast one, (almost always more) hidden layer with a set number of units, these are the features that are being talked about. So a deep learning algorithm doesnt just learn the sets of weights, in that process it also learns the values for hidden units which are complex high level features of the trivial data that you have given. Hence while practicing vanilla machine learning a lot of expertise lies in your ability to engineer features because the algorithm isnt learning any by itself. I hope I answered your question.






                                share|improve this answer









                                $endgroup$



                                Okay, think of it like this. In machine learning algirithms, such as linear regression or random forest you give the algorithms a set of features and the target and then it tries to minimize the cost function, so no it doesnt learn any new features, it just learns the weights. Now when you come to deep learning, you have atleast one, (almost always more) hidden layer with a set number of units, these are the features that are being talked about. So a deep learning algorithm doesnt just learn the sets of weights, in that process it also learns the values for hidden units which are complex high level features of the trivial data that you have given. Hence while practicing vanilla machine learning a lot of expertise lies in your ability to engineer features because the algorithm isnt learning any by itself. I hope I answered your question.







                                share|improve this answer












                                share|improve this answer



                                share|improve this answer










                                answered Jan 20 '17 at 11:15









                                Himanshu RaiHimanshu Rai

                                1,32748




                                1,32748











                                • $begingroup$
                                  Just another question: In for example CNNs, are the features (or filters) not the same thing as the weights?
                                  $endgroup$
                                  – user2835098
                                  Jan 20 '17 at 11:21










                                • $begingroup$
                                  No, they are the weights for the convolution layer, but the product obtained out of the convolution i.e. the feature maps are the features.
                                  $endgroup$
                                  – Himanshu Rai
                                  Jan 20 '17 at 11:39










                                • $begingroup$
                                  I disagree. Hidden variables are also present in random forest and boosting algorithms. And you still engineer features in deep learning. Like cropping area in one of the best image recognition algorithms in 2017
                                  $endgroup$
                                  – keiv.fly
                                  Oct 11 '18 at 22:45
















                                • $begingroup$
                                  Just another question: In for example CNNs, are the features (or filters) not the same thing as the weights?
                                  $endgroup$
                                  – user2835098
                                  Jan 20 '17 at 11:21










                                • $begingroup$
                                  No, they are the weights for the convolution layer, but the product obtained out of the convolution i.e. the feature maps are the features.
                                  $endgroup$
                                  – Himanshu Rai
                                  Jan 20 '17 at 11:39










                                • $begingroup$
                                  I disagree. Hidden variables are also present in random forest and boosting algorithms. And you still engineer features in deep learning. Like cropping area in one of the best image recognition algorithms in 2017
                                  $endgroup$
                                  – keiv.fly
                                  Oct 11 '18 at 22:45















                                $begingroup$
                                Just another question: In for example CNNs, are the features (or filters) not the same thing as the weights?
                                $endgroup$
                                – user2835098
                                Jan 20 '17 at 11:21




                                $begingroup$
                                Just another question: In for example CNNs, are the features (or filters) not the same thing as the weights?
                                $endgroup$
                                – user2835098
                                Jan 20 '17 at 11:21












                                $begingroup$
                                No, they are the weights for the convolution layer, but the product obtained out of the convolution i.e. the feature maps are the features.
                                $endgroup$
                                – Himanshu Rai
                                Jan 20 '17 at 11:39




                                $begingroup$
                                No, they are the weights for the convolution layer, but the product obtained out of the convolution i.e. the feature maps are the features.
                                $endgroup$
                                – Himanshu Rai
                                Jan 20 '17 at 11:39












                                $begingroup$
                                I disagree. Hidden variables are also present in random forest and boosting algorithms. And you still engineer features in deep learning. Like cropping area in one of the best image recognition algorithms in 2017
                                $endgroup$
                                – keiv.fly
                                Oct 11 '18 at 22:45




                                $begingroup$
                                I disagree. Hidden variables are also present in random forest and boosting algorithms. And you still engineer features in deep learning. Like cropping area in one of the best image recognition algorithms in 2017
                                $endgroup$
                                – keiv.fly
                                Oct 11 '18 at 22:45

















                                draft saved

                                draft discarded
















































                                Thanks for contributing an answer to Data Science Stack Exchange!


                                • Please be sure to answer the question. Provide details and share your research!

                                But avoid


                                • Asking for help, clarification, or responding to other answers.

                                • Making statements based on opinion; back them up with references or personal experience.

                                Use MathJax to format equations. MathJax reference.


                                To learn more, see our tips on writing great answers.




                                draft saved


                                draft discarded














                                StackExchange.ready(
                                function ()
                                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f16422%2fmachine-learning-vs-deep-learning%23new-answer', 'question_page');

                                );

                                Post as a guest















                                Required, but never shown





















































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown

































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown







                                Popular posts from this blog

                                Adding axes to figuresAdding axes labels to LaTeX figuresLaTeX equivalent of ConTeXt buffersRotate a node but not its content: the case of the ellipse decorationHow to define the default vertical distance between nodes?TikZ scaling graphic and adjust node position and keep font sizeNumerical conditional within tikz keys?adding axes to shapesAlign axes across subfiguresAdding figures with a certain orderLine up nested tikz enviroments or how to get rid of themAdding axes labels to LaTeX figures

                                Tähtien Talli Jäsenet | Lähteet | NavigointivalikkoSuomen Hippos – Tähtien Talli

                                Do these cracks on my tires look bad? The Next CEO of Stack OverflowDry rot tire should I replace?Having to replace tiresFishtailed so easily? Bad tires? ABS?Filling the tires with something other than air, to avoid puncture hassles?Used Michelin tires safe to install?Do these tyre cracks necessitate replacement?Rumbling noise: tires or mechanicalIs it possible to fix noisy feathered tires?Are bad winter tires still better than summer tires in winter?Torque converter failure - Related to replacing only 2 tires?Why use snow tires on all 4 wheels on 2-wheel-drive cars?