Can video descriptions be learned from videos + human descriptions? [on hold]

I have a dataset of short videos with simple content, each paired with a human-written description, for example "ball is pushed", "cube is grabbed", "book is grabbed", etc.
What strategies are there for learning to describe a new video (for example "cube is pushed")?
I have no concrete classes into which I want to classify the videos; instead I need to generate a natural-language sentence.

Can you point me to exemplary research I could study, or to the techniques I would need to use?



put on hold as too broad by Spacedman, Ethan, Siong Thye Goh, Toros91, Mark.F Mar 20 at 8:20
























Tags: machine-learning, natural-language-process, labels






asked Mar 19 at 14:33 by arcGuesser




          1 Answer
For this purpose, you would train multiple sequence-to-sequence (seq2seq) models: for example, one model to learn patterns across the video frames, and another that merges a language model with embeddings to learn features.

This paper gives a good overview of the problem domain and two promising solutions: http://cs231n.stanford.edu/reports/2017/pdfs/31.pdf
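
As a rough illustration of the language side of a seq2seq setup, here is a minimal sketch of how the short captions from the question could be turned into decoder training pairs, and how a sentence is then generated word by word. It is not the method from the linked paper. All names (`build_vocab`, `encode_caption`, `greedy_decode`, `stub_decoder`) are hypothetical, and `stub_decoder` is only a stand-in for a trained decoder that would be conditioned on features from the video frames.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Special tokens (an assumption of this sketch, but a common convention).
PAD, BOS, EOS, UNK = "<pad>", "<bos>", "<eos>", "<unk>"


def build_vocab(captions: Sequence[str]) -> Dict[str, int]:
    """Map every word seen in the training captions to an integer id."""
    vocab = {PAD: 0, BOS: 1, EOS: 2, UNK: 3}
    for caption in captions:
        for word in caption.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode_caption(caption: str, vocab: Dict[str, int]) -> Tuple[List[int], List[int]]:
    """Return (decoder_input, decoder_target) ids for teacher forcing.

    The input is the caption shifted right with <bos>; the target is the
    caption followed by <eos>, so position i is trained to predict word i.
    """
    ids = [vocab.get(word, vocab[UNK]) for word in caption.lower().split()]
    return [vocab[BOS]] + ids, ids + [vocab[EOS]]


def greedy_decode(
    next_word_scores: Callable[[List[int]], List[float]],
    vocab: Dict[str, int],
    max_len: int = 10,
) -> str:
    """Generate a sentence one word at a time, always taking the top-scoring word.

    next_word_scores stands in for a trained decoder: given the ids generated
    so far, it returns one score per vocabulary entry.
    """
    id_to_word = {index: word for word, index in vocab.items()}
    generated = [vocab[BOS]]
    for _ in range(max_len):
        scores = next_word_scores(generated)
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == vocab[EOS]:
            break
        generated.append(best)
    return " ".join(id_to_word[i] for i in generated[1:])


captions = ["ball is pushed", "cube is grabbed", "book is grabbed"]
vocab = build_vocab(captions)
decoder_input, decoder_target = encode_caption("cube is grabbed", vocab)


def stub_decoder(prefix: List[int]) -> List[float]:
    """Hypothetical stand-in for a trained model; it always spells out one sentence."""
    wanted = ["cube", "is", "pushed"]
    step = len(prefix) - 1  # number of words generated so far
    target = vocab[wanted[step]] if step < len(wanted) else vocab[EOS]
    return [1.0 if i == target else 0.0 for i in range(len(vocab))]


print(decoder_input)                       # [1, 7, 5, 8]
print(decoder_target)                      # [7, 5, 8, 2]
print(greedy_decode(stub_decoder, vocab))  # cube is pushed
```

Because the decoder emits words rather than a class label, it can in principle produce a combination such as "cube is pushed" that never appeared as a whole caption in the training set, as long as each word did.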



















answered Mar 20 at 4:39 by Shamit Verma












