Can video descriptions be learned from videos + human descriptions? [on hold]













$begingroup$


I have a dataset of short videos with simple content, each paired with a human description, for example "ball is pushed", "cube is grabbed", "book is grabbed", etc.
What strategies are there for learning to describe a new video (for example, "cube is pushed")?
I have no fixed set of classes to sort the videos into; instead, I need to generate a natural-language sentence.



Do you know of exemplary research I could study, or which techniques I would have to use?

















$endgroup$



put on hold as too broad by Spacedman, Ethan, Siong Thye Goh, Toros91, Mark.F Mar 20 at 8:20


Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.






















      machine-learning natural-language-process labels






      asked Mar 19 at 14:33









      arcGuesser























          1 Answer












          $begingroup$

          For this task, you can train several sequence-to-sequence (seq2seq) models: for example, one model that learns patterns across the video frames and another that combines a language model with embeddings to learn the features.






          This paper gives a good overview of the problem domain and two promising solutions: http://cs231n.stanford.edu/reports/2017/pdfs/31.pdf
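To make the seq2seq idea a bit more concrete, here is a minimal sketch of how the human descriptions could be turned into decoder inputs and targets. It is not taken from the linked paper: the helper names (`build_vocab`, `encode`, `teacher_forcing_pair`) and the special tokens are assumptions made for illustration, and the video encoder itself is left out.

```python
# Sketch only: prepares caption text for a seq2seq decoder.
# Helper names and special tokens are hypothetical, chosen for illustration.

PAD, BOS, EOS, UNK = "<pad>", "<bos>", "<eos>", "<unk>"


def build_vocab(captions):
    """Map every word in the training captions to an integer id."""
    vocab = {PAD: 0, BOS: 1, EOS: 2, UNK: 3}
    for caption in captions:
        for word in caption.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(caption, vocab):
    """Turn a caption into ids, wrapped in start and end markers."""
    ids = [vocab.get(word, vocab[UNK]) for word in caption.lower().split()]
    return [vocab[BOS]] + ids + [vocab[EOS]]


def teacher_forcing_pair(ids):
    """Decoder input drops the last token; the target is shifted left by one."""
    return ids[:-1], ids[1:]


# Example captions taken from the question.
captions = ["ball is pushed", "cube is grabbed", "book is grabbed"]
vocab = build_vocab(captions)
encoded = [encode(c, vocab) for c in captions]
pairs = [teacher_forcing_pair(ids) for ids in encoded]

print(pairs[0])  # ([1, 4, 5, 6], [4, 5, 6, 2])

# The unseen description "cube is pushed" uses only words from the training set.
new_ids = encode("cube is pushed", vocab)
has_unknown = vocab[UNK] in new_ids
print(new_ids, has_unknown)  # [1, 7, 5, 6, 2] False
```

With teacher forcing, the decoder receives the input sequence and is trained to predict the target one step ahead; at inference time it would start from `<bos>` and generate tokens until it emits `<eos>`. Since "cube is pushed" contains only words seen in training, a simple word-level vocabulary may already be sufficient for a dataset like this.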












          $endgroup$



















                answered Mar 20 at 4:39









                Shamit Verma













