Random Forests Feature Selection on Time Series Data

$begingroup$


I have a dataset with N features, each observed at 500 time instances.

Say the features are x, y, v_x, v_y, a_x, a_y, j_x, j_y. One sample consists of 500 instances (rows in a table) for each feature; another sample consists of a different 500 instances, and each sample has a class label.

I'd like to select a subset of the features automatically with the Random Forests algorithm. The problem is that the implementation I'm using (scikit-learn's RandomForestClassifier) accepts only a 2D array as its X input, of shape [N_samples, N_features]. If I pass the data as it is, that is, a vector of length 500 for feature x, another of length 500 for feature y, and so on, I get an N_samples x N_features x 500 array, which is incompatible with the requirements of RandomForestClassifier.

I tried unrolling the matrix into a vector, giving a 500 x N_features array, but then the selection treats every element as an independent feature and breaks my structure.

How can I reduce the features (by selection), keeping the time instances of each feature together? I'd prefer to use this algorithm, but I'm open to other libraries and/or algorithms.

My goal is classification, so forecasting resources are of limited use to me. I also have the constraint that each sample contains all of those instances; unfortunately I don't have them as separate samples.










$endgroup$
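For concreteness, the shape mismatch described in the question can be reproduced with synthetic data; the sizes and names below are illustrative, not taken from the asker's dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_samples, n_features, n_timesteps = 40, 8, 500
X3d = rng.normal(size=(n_samples, n_features, n_timesteps))  # samples x features x time
y = rng.integers(0, 2, size=n_samples)

clf = RandomForestClassifier(n_estimators=10, random_state=0)
try:
    clf.fit(X3d, y)  # scikit-learn only accepts 2D [n_samples, n_features] input
except ValueError as exc:
    print("3D input rejected:", type(exc).__name__)

# Flattening is mechanically valid but treats every time step as an
# independent feature, which is exactly the loss of structure described above.
X_flat = X3d.reshape(n_samples, n_features * n_timesteps)
clf.fit(X_flat, y)
print(X_flat.shape)  # (40, 4000)
```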











  • $begingroup$
    Welcome to this site! If you want to treat 500 values per feature as "all or nothing", i.e. not breaking the structure, one way is to use the average for each feature thus reducing 500 to 1.
    $endgroup$
    – Esmailian
    Mar 28 at 22:16










  • $begingroup$
    But the features carry semantics that are largely lost if I just take the average. I did try something similar: I computed the DTW distance of each feature against the corresponding feature sequence of a target sample (the average of 3-4 target samples), where the target's class is one of the active classes (in a binarized, one-vs-all comparison), and still had no success. On the class I'm interested in I get up to 0.50 precision and 1.00 recall if I leave the difficult class out, and less if I include it.
    $endgroup$
    – user1714647
    Mar 28 at 22:59
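For reference, the DTW distance mentioned in this comment can be computed with a short dynamic program; a minimal sketch (not the commenter's actual code):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

print(dtw_distance([0, 1, 2], [0, 1, 2]))  # 0.0 for identical sequences
```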










  • $begingroup$
    Can you say something more about what kind of data this is?
    $endgroup$
    – jonnor
    2 days ago










  • $begingroup$
    What is the performance when you flatten the features? Sometimes it actually works fine, with a strong enough model and enough data...
    $endgroup$
    – jonnor
    2 days ago















python scikit-learn time-series feature-selection random-forest






asked Mar 28 at 22:04









user1714647

2 Answers
$begingroup$

Some EDA might be needed to create new features for each time series. You might want to mine for patterns and let the random forest reduce the overfitting. Exactly how the mining is done depends on the nature of the problem, which might point to things like:

  • interesting time periods,
  • events that happen at a particular time,
  • time lags between different series,
  • dynamical systems,
  • latent variables,
  • heteroscedasticity.

Breiman's landmark paper on random forests gives some theoretical guarantees that a random forest works well when the individual classifiers are good and the correlation between them is low. This can also serve as a heuristic for pruning features.

$endgroup$



answered Mar 29 at 3:38
Yee Sern Tan
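One way to act on this advice is to engineer a few summary statistics per series and then group the resulting importances back by original feature, so selection happens at the level of whole time series. A sketch under those assumptions (the statistics and sizes chosen here are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_samples, n_features, n_timesteps = 60, 8, 500
X3d = rng.normal(size=(n_samples, n_features, n_timesteps))
y = rng.integers(0, 3, size=n_samples)

# Engineer a few per-series summary features so each original feature
# contributes a small, named group of columns instead of 500 raw values.
stats = {
    "mean": X3d.mean(axis=2),
    "std": X3d.std(axis=2),
    "min": X3d.min(axis=2),
    "max": X3d.max(axis=2),
}
X2d = np.concatenate(list(stats.values()), axis=1)  # (n_samples, 4 * n_features)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X2d, y)

# Sum the importances of the columns derived from the same original feature.
imp = clf.feature_importances_.reshape(len(stats), n_features).sum(axis=0)
ranking = np.argsort(imp)[::-1]
print("feature ranking:", ranking)
```

Libraries such as tsfresh automate this kind of per-series feature extraction at scale, if hand-picking statistics becomes unwieldy.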

$begingroup$

If you want to preserve and utilize the 2D structure, use something like a Convolutional Neural Network. Feature selection can then be done with L1 regularization. Otherwise you will have to do the feature engineering outside the classifier.

This 2D structure, with one axis being time, is quite similar to the spectrograms used in audio, where CNNs are frequently applied. So check out the literature on Acoustic Event Recognition and Acoustic Scene Classification for more details.

$endgroup$



answered 2 days ago
jonnor
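Independently of any particular deep-learning library, the way a convolution respects the feature-by-time layout can be illustrated with a single 1D convolution over the time axis in plain NumPy; sizes and filter counts below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_timesteps, n_filters, kernel = 8, 500, 4, 7

x = rng.normal(size=(n_features, n_timesteps))        # one sample: channels x time
w = rng.normal(size=(n_filters, n_features, kernel))  # each filter spans all channels

# 1D convolution over time: each output step mixes all features within a
# short window, so the per-feature time structure is used, not flattened away.
out_len = n_timesteps - kernel + 1
out = np.empty((n_filters, out_len))
for f in range(n_filters):
    for t in range(out_len):
        out[f, t] = np.sum(w[f] * x[:, t:t + kernel])

# Global average pooling yields a fixed-size descriptor per sample that a
# classifier (or an L1-regularized layer) can consume.
descriptor = out.mean(axis=1)
print(descriptor.shape)  # (4,)
```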
