Combinable filters
$begingroup$
I have an initial pool of subjects, and I need to apply a set of general criteria to retain a smaller subset (SS1) of subjects. Then I need to divide this smaller subset (SS1) into yet finer subsets (SS1-A, SS1-B and the rest). A specific set of criteria will be applied to SS1 to obtain SS1-A, another set of specific criteria will be applied to obtain SS1-B, and the rest will be discarded. The set of criteria/filters needs to be flexible: I would like to add, remove, or combine filters for testing and development, as well as for further client requests.
I created the small code skeleton below to help me understand and test the implementation of the template method and filter patterns. I use a list and some filters instead of the actual subject pool, but the idea is the same: the list items can be seen as subjects with different attributes.
from abc import ABC, abstractmethod


class DataProcessing(ABC):

    def __init__(self, my_list):
        self.my_list = my_list

    def data_processing_steps(self):
        self.remove_duplicate()
        self.general_filtering()
        self.subject_specific_filtering()
        self.return_list()

    def remove_duplicate(self):
        self.my_list = set(list(self.my_list))

    @abstractmethod
    def general_filtering(self): pass

    def subject_specific_filtering(self): pass

    def return_list(self):
        return self.my_list


class DataProcessing_Project1(DataProcessing):

    def general_filtering(self):
        maxfilter_obj = MaxFilter()
        minfilter_obj = MinFilter()
        CombinedFilter_obj = CombinedFilter(maxfilter_obj, minfilter_obj)
        self.my_list = CombinedFilter_obj.filter(self.my_list)


class DataProcessing_Project1_SubjectA(DataProcessing_Project1):

    def subject_specific_filtering(self):
        twentythreefilter_obj = TwentyThreeFilter()
        self.my_list = twentythreefilter_obj.filter(self.my_list)


class DataProcessing_Project1_SubjectB(DataProcessing_Project1): pass


class Criteria():

    @abstractmethod
    def filter(self, request):
        raise NotImplementedError('Should have implemented this.')


class CombinedFilter(Criteria):

    def __init__(self, filter1, filter2):
        self.filter1 = filter1
        self.filter2 = filter2

    def filter(self, this_list):
        filteredList1 = self.filter1.filter(this_list)
        filteredList2 = self.filter2.filter(filteredList1)
        return filteredList2


class MaxFilter(Criteria):

    def __init__(self, max_val=100):
        self.max_val = max_val

    def filter(self, this_list):
        filteredList = []
        for item in this_list:
            if item <= self.max_val:
                filteredList.append(item)
        return filteredList


class MinFilter(Criteria):

    def __init__(self, min_val=10):
        self.min_val = min_val

    def filter(self, this_list):
        filteredList = []
        for item in this_list:
            if item >= self.min_val:
                filteredList.append(item)
        return filteredList


class TwentyThreeFilter(Criteria):

    def __init__(self): pass

    def filter(self, this_list):
        filteredList = []
        for item in this_list:
            if item != 23:
                filteredList.append(item)
        return filteredList


this_list = [1, 2, 23, 4, 34, 456, 234, 23, 3457, 5, 2]

ob = MaxFilter()
this_list2 = ob.filter(this_list)
print(this_list2)

ob2 = MinFilter()
this_list3 = ob2.filter(this_list2)
print(this_list3)

ob3 = CombinedFilter(ob, ob2)
this_list4 = ob3.filter(this_list)
print(this_list4)

ob4 = DataProcessing_Project1(my_list=this_list)
ob4.data_processing_steps()
print(ob4.return_list())

ob5 = DataProcessing_Project1_SubjectA(my_list=this_list)
ob5.data_processing_steps()
print(ob5.return_list())

# Error
twentythreefilter_obj = TwentyThreeFilter()
ob6 = CombinedFilter(ob, ob2, twentythreefilter_obj)
this_list4 = ob3.filter(this_list)
print(this_list4)
I am fairly new to design patterns. I wonder whether this is implemented correctly, and whether there are areas that can be improved.
Also, for ob6, I would like to pass another filter as a parameter to CombinedFilter(), but I am not sure how to write __init__ and filter() within the CombinedFilter class so that it can accommodate any number of new filters.
python python-3.x object-oriented
$endgroup$
asked 17 hours ago by KubiK888, edited 17 hours ago by 200_success
2 Answers
$begingroup$
Your approach is suitable for a language like Java. But in Python? Stop writing classes! This is especially true for your task, where much of the code consists of do-nothing placeholders (in bold below) just to allow functionality to be implemented by subclasses.
from abc import ABC, abstractmethod


class DataProcessing(ABC):

    def __init__(self, my_list):
        self.my_list = my_list

    def data_processing_steps(self):
        self.remove_duplicate()
        self.general_filtering()
        self.subject_specific_filtering()
        self.return_list()

    def remove_duplicate(self):
        self.my_list = set(list(self.my_list))

    @abstractmethod
    def general_filtering(self): pass

    def subject_specific_filtering(self): pass

    def return_list(self):
        return self.my_list


class DataProcessing_Project1(DataProcessing):

    def general_filtering(self):
        maxfilter_obj = MaxFilter()
        minfilter_obj = MinFilter()
        CombinedFilter_obj = CombinedFilter(maxfilter_obj, minfilter_obj)
        self.my_list = CombinedFilter_obj.filter(self.my_list)


class DataProcessing_Project1_SubjectA(DataProcessing_Project1):

    def subject_specific_filtering(self):
        twentythreefilter_obj = TwentyThreeFilter()
        self.my_list = twentythreefilter_obj.filter(self.my_list)


class DataProcessing_Project1_SubjectB(DataProcessing_Project1): pass
Furthermore, it's unnatural to have my_list be part of the state of the DataProcessing instance, and it's especially awkward to have to retrieve the result by calling .return_list().
Note that in
    def remove_duplicate(self):
        self.my_list = set(list(self.my_list))
… my_list temporarily becomes a set rather than a list. You should have written self.my_list = list(set(self.my_list)) instead.
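If the original order of the subjects matters, keep in mind that a set does not preserve it. A minimal sketch of an order-preserving alternative, assuming the items are hashable (the helper name remove_duplicates is hypothetical and not part of the question's code):

```python
def remove_duplicates(items):
    """Return a list without duplicates, keeping first occurrences in order."""
    # dict keys are unique and keep insertion order (guaranteed since Python 3.7).
    return list(dict.fromkeys(items))


this_list = [1, 2, 23, 4, 34, 456, 234, 23, 3457, 5, 2]
print(remove_duplicates(this_list))  # [1, 2, 23, 4, 34, 456, 234, 3457, 5]
```

Whether order matters depends on the real subject pool; for the toy list either approach gives the same elements.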
Suggested solution
This task is more naturally suited to functional programming. Each filter can be a function that accepts an iterable and returns an iterable. You can then easily combine filters through function composition.
As a bonus, you can take advantage of default parameter values in Python to supply generic processing steps. Then, just use None to indicate an absent processing step.
######################################################################
# Primitive filters
######################################################################
def deduplicator():
    return lambda iterable: list(set(iterable))

def at_least(threshold=10):
    return lambda iterable: [n for n in iterable if n >= threshold]

def at_most(threshold=100):
    return lambda iterable: [n for n in iterable if n <= threshold]

def is_not(bad_value):
    return lambda iterable: [n for n in iterable if n != bad_value]

######################################################################
# Higher-order filters
######################################################################
def compose(*filters):
    def composed(iterable):
        for f in filters:
            if f is not None:
                iterable = f(iterable)
        return iterable
    return composed

def data_processing(
        deduplicate=deduplicator(),
        general=compose(at_least(), at_most()),
        specific=None,
    ):
    return compose(deduplicate, general, specific)

######################################################################
# Demonstration
######################################################################
this_list = [1, 2, 23, 4, 34, 456, 234, 23, 3457, 5, 2]

ob = at_most()
this_list2 = ob(this_list)
print(this_list2)       # [1, 2, 23, 4, 34, 23, 5, 2]

ob2 = at_least()
this_list3 = ob2(this_list2)
print(this_list3)       # [23, 34, 23]

ob3 = compose(ob, ob2)
this_list4 = ob3(this_list)
print(this_list4)       # [23, 34, 23]

ob4 = data_processing()
print(ob4(this_list))   # [34, 23]

ob5 = data_processing(specific=is_not(23))
print(ob5(this_list))   # [34]

ob6 = compose(ob, ob2, is_not(23))
print(ob6(this_list))   # [34]
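If you would rather keep the class-based design from the question, the same variadic idea as compose(*filters) carries over. The following is only a sketch under that assumption: the class names come from the question, but the bodies are rewritten here for brevity and the *filters signature is my suggestion, not something the original code had.

```python
class MaxFilter:
    def __init__(self, max_val=100):
        self.max_val = max_val

    def filter(self, items):
        return [item for item in items if item <= self.max_val]


class MinFilter:
    def __init__(self, min_val=10):
        self.min_val = min_val

    def filter(self, items):
        return [item for item in items if item >= self.min_val]


class TwentyThreeFilter:
    def filter(self, items):
        return [item for item in items if item != 23]


class CombinedFilter:
    """Hypothetical variadic variant: applies any number of filters in order."""

    def __init__(self, *filters):
        # *filters collects all positional arguments into a tuple.
        self.filters = filters

    def filter(self, items):
        for f in self.filters:
            items = f.filter(items)
        return items


this_list = [1, 2, 23, 4, 34, 456, 234, 23, 3457, 5, 2]
ob6 = CombinedFilter(MaxFilter(), MinFilter(), TwentyThreeFilter())
print(ob6.filter(this_list))  # [34]
```

Because CombinedFilter itself has a filter() method, combined filters can be nested inside other combined filters.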
$endgroup$
$begingroup$
Is your "don't write classes" comment aimed at my boilerplate code above, or meant in general? I see that many design patterns (e.g. abstract factory - sourcemaking.com/design_patterns/abstract_factory/python/1) use a lot of empty classes (with pass) as interfaces; does that mean refactoring code with these design patterns is generally bad? My code above came out of a subsequent question in which I tried following Girish's comments (stackoverflow.com/questions/55858784/…)
$endgroup$
– KubiK888
5 hours ago
$begingroup$
Also, my code above is basically test code. My data will be a Pandas df, in which I will use different variable columns as attributes to filter subjects. Would the iterable suggestion still apply? If so, do I treat each row/subject as an iterable?
$endgroup$
– KubiK888
5 hours ago
$begingroup$
A lot of "design patterns" are just workarounds for limitations of straitjacket OOP languages like Java or C++. Watch the video.
$endgroup$
– 200_success
5 hours ago
$begingroup$
We encourage users to post real code for review, because we can't review what you had in mind but decided not to show. (See How to Ask.) If you have another question, then post it as a separate follow-up question.
$endgroup$
– 200_success
5 hours ago
$begingroup$
Thanks, appreciated.
$endgroup$
– KubiK888
4 hours ago
$begingroup$
I think you would benefit from viewing your processing steps and criteria as filters that operate on iterables.
Suppose you have a sequence, like a set or a list or a tuple. You could iterate over that sequence like so:
for item in sequence:
    pass
Now suppose you use the iter() built-in function to create an iterator, instead. Now you can pass around that iterator, and even extract values from it:
it = iter(sequence)
first_item = next(it)
print_remaining_items(it)
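Here is a runnable variant of that snippet, as a small sketch: list() stands in for the print_remaining_items placeholder, and the sample values are made up. It also shows that an iterator can only be consumed once.

```python
sequence = [10, 20, 30, 40]

it = iter(sequence)
first_item = next(it)   # pulls the first value out of the iterator
remaining = list(it)    # consumes everything that is left

print(first_item)   # 10
print(remaining)    # [20, 30, 40]
print(list(it))     # [] because the iterator is now exhausted
```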
Finally, suppose you take advantage of generator functions and avoid collecting and returning entire lists. You can iterate over the elements of an iterable, inspect the individual values, and yield the ones you choose:
def generator(it):
    for item in it:
        if choose(item):
            yield item
This allows you to process one iterable, and iterate over the results of your function, which makes it another iterable.
Thus, you can build a "stack" of iterables, with your initial sequence (or perhaps just an iterable) at the bottom, and some generator function at each higher level:
ibl = sequence
st1 = generator(ibl)
st2 = generator(st1)
st3 = generator(st2)
for item in st3:
    print(item)  # Will print chosen items from sequence
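To make the stack concrete, here is a minimal sketch using the question's sample list. The helper keep_if and the three lambdas are hypothetical names chosen for illustration; they play the role of generator and choose above.

```python
def keep_if(predicate, it):
    """Lazily yield the items of `it` for which predicate(item) is true."""
    for item in it:
        if predicate(item):
            yield item


sequence = [1, 2, 23, 4, 34, 456, 234, 23, 3457, 5, 2]

st1 = keep_if(lambda n: n <= 100, sequence)  # at most 100
st2 = keep_if(lambda n: n >= 10, st1)        # at least 10
st3 = keep_if(lambda n: n != 23, st2)        # drop 23

# No filtering has happened yet; list() pulls items through all three stages.
result = list(st3)
print(result)  # [34]
```

Each item travels through the whole stack before the next one is read, so no intermediate lists are built.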
So how would this work in practice?
Let's start with a simple use case: you have an iterable, and you wish to filter it using one or more simple conditionals.
class FilteredData:
    def __init__(self, ibl):
        self.iterable = ibl
        self.condition = self.yes

    def __iter__(self):
        for item in self.iterable:
            if self.condition(item):
                yield item

    def yes(self, item):
        return True

obj = FilteredData([1,2,3,4])

for item in obj:
    print(item)   # 1, 2, 3, 4

obj.condition = lambda item: item % 2 == 0

for item in obj:
    print(item)   # 2, 4
How can we combine multiple conditions? By "stacking" objects. Wrap one iterable item inside another, and you "compose" the filters:
obj = FilteredData([1,2,3,4])
obj.condition = lambda item: item % 2 == 0

obj2 = FilteredData(obj)
obj2.condition = lambda item: item < 3

for item in obj2:
    print(item)   # 2
Obviously, you can make things more complex. I'd suggest that you not do that until you establish a clear need.
For example, you could pass in the lambda as part of the constructor. Or subclass FilteredData.
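A sketch of the constructor variant might look like this. The optional condition parameter and its keep-everything default are assumptions of mine, not a fixed design:

```python
class FilteredData:
    def __init__(self, iterable, condition=None):
        self.iterable = iterable
        # Assumed default: with no condition given, every item is kept.
        self.condition = condition if condition is not None else (lambda item: True)

    def __iter__(self):
        for item in self.iterable:
            if self.condition(item):
                yield item


evens = FilteredData([1, 2, 3, 4], lambda item: item % 2 == 0)
small_evens = FilteredData(evens, lambda item: item < 3)

print(list(evens))        # [2, 4]
print(list(small_evens))  # [2]
```

Passing the condition up front means an object is never left in a half-configured state between construction and assignment.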
As another example, you could "slurp" up the entire input as part of your __iter__ method in order to compute some aggregate value (like min, max, or average), then yield the values one at a time. It's painful since it consumes O(N) memory instead of just O(1), but sometimes it's necessary. That would require a subclass, or a more complex class.
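As a rough sketch of that "slurping" idea, the hypothetical class below keeps only the items above the average of its input. The class name and the above-average rule are illustrative assumptions, not part of the question.

```python
class AboveAverage:
    """Yield only the items that are greater than the mean of the input."""

    def __init__(self, iterable):
        self.iterable = iterable

    def __iter__(self):
        # The whole input must be read before the average is known: O(N) memory.
        items = list(self.iterable)
        if not items:
            return  # nothing to yield for an empty input
        average = sum(items) / len(items)
        for item in items:
            if item > average:
                yield item


print(list(AboveAverage([1, 2, 3, 4, 10])))  # [10]
```

It still exposes the same iterable interface, so it can sit anywhere in a stack of filters.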
$endgroup$
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f219228%2fcombinable-filters%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Your approach is suitable for a language like Java. But in Python? Stop writing classes! This is especially true for your task, where much of the code consists of do-nothing placeholders (in bold below) just to allow functionality to be implemented by subclasses.
from abc import ABC, abstractmethod
class DataProcessing(ABC):
def __init__(self, my_list):
self.my_list = my_list
def data_processing_steps(self):
self.remove_duplicate()
self.general_filtering()
self.subject_specific_filtering()
self.return_list()
def remove_duplicate(self):
self.my_list = set(list(self.my_list))
@abstractmethod
def general_filtering(self): pass
def subject_specific_filtering(self): pass
def return_list(self):
return self.my_list
class DataProcessing_Project1(DataProcessing):
def general_filtering(self):
maxfilter_obj = MaxFilter()
minfilter_obj = MinFilter()
CombinedFilter_obj = CombinedFilter(maxfilter_obj, minfilter_obj)
self.my_list = CombinedFilter_obj.filter(self.my_list)
class DataProcessing_Project1_SubjectA(DataProcessing_Project1):
def subject_specific_filtering(self):
twentythreefilter_obj = TwentyThreeFilter()
self.my_list = twentythreefilter_obj.filter(self.my_list)
class DataProcessing_Project1_SubjectB(DataProcessing_Project1): pass
Furthermore, it's unnatural to have my_list
be part of the state of the DataProcessing
instance, and it's especially awkward to have to retrieve the result by calling .return_list()
.
Note that in
def remove_duplicate(self):
self.my_list = set(list(self.my_list))
… my_list
temporarily becomes a set
rather than a list
. You should have written self.my_list = list(set(self.my_list))
instead.
Suggested solution
This task is more naturally suited to functional programming. Each filter can be a function that accepts an iterable and returns an iterable. You can then easily combine filters through function composition.
As a bonus, you can take advantage of default parameter values in Python to supply generic processing steps. Then, just use None
to indicate that an absent processing step.
######################################################################
# Primitive filters
######################################################################
def deduplicator():
return lambda iterable: list(set(iterable))
def at_least(threshold=10):
return lambda iterable: [n for n in iterable if n >= threshold]
def at_most(threshold=100):
return lambda iterable: [n for n in iterable if n <= threshold]
def is_not(bad_value):
return lambda iterable: [n for n in iterable if n != bad_value]
######################################################################
# Higher-order filters
######################################################################
def compose(*filters):
def composed(iterable):
for f in filters:
if f is not None:
iterable = f(iterable)
return iterable
return composed
def data_processing(
deduplicate=deduplicator(),
general=compose(at_least(), at_most()),
specific=None,
):
return compose(deduplicate, general, specific)
######################################################################
# Demonstration
######################################################################
this_list = [1, 2, 23, 4, 34, 456, 234, 23, 3457, 5, 2]
ob = at_most()
this_list2 = ob(this_list)
print(this_list2) # [1, 2, 23, 4, 34, 23, 5, 2]
ob2 = at_least()
this_list3 = ob2(this_list2)
print(this_list3) # [23, 34, 23]
ob3 = compose(ob, ob2)
this_list4 = ob3(this_list)
print(this_list4) # [23, 34, 23]
ob4 = data_processing()
print(ob4(this_list)) # [34, 23]
ob5 = data_processing(specific=is_not(23))
print(ob5(this_list)) # [34]
ob6 = compose(ob, ob2, is_not(23))
print(ob6(this_list)) # [34]
$endgroup$
$begingroup$
Is your "don't write classes" comment directed at my boilerplate code above, or meant in general? I see that many design patterns (e.g. abstract factory, sourcemaking.com/design_patterns/abstract_factory/python/1) use a lot of empty classes (with pass) as interfaces. Does that mean refactoring code with these design patterns is generally bad? My code above came out of a follow-up to an earlier question, in which I tried following Girish's comments (stackoverflow.com/questions/55858784/…)
$endgroup$
– KubiK888
5 hours ago
$begingroup$
Also, my code above is basically test code. My real data will be a Pandas DataFrame, in which I will use different columns as attributes to filter subjects. Would the iterable suggestion still apply? If so, do I treat each row/subject as an iterable?
$endgroup$
– KubiK888
5 hours ago
$begingroup$
A lot of "design patterns" are just workarounds for limitations of straitjacket OOP languages like Java or C++. Watch the video.
$endgroup$
– 200_success
5 hours ago
$begingroup$
We encourage users to post real code for review, because we can't review what you had in mind but decided not to show. (See How to Ask.) If you have another question, then post it as a separate follow-up question.
$endgroup$
– 200_success
5 hours ago
$begingroup$
Thanks, appreciated.
$endgroup$
– KubiK888
4 hours ago
edited 15 hours ago
answered 15 hours ago
– 200_success
$begingroup$
I think you would benefit from viewing your processing steps and criteria as filters that operate on iterables.
Suppose you have a sequence, like a list or a tuple (or any other iterable collection, such as a set). You could iterate over that sequence like so:
for item in sequence:
    pass
Now suppose you use the iter() built-in function to create an iterator instead. Now you can pass around that iterator, and even extract values from it:
it = iter(sequence)
first_item = next(it)
print_remaining_items(it)
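As a small runnable sketch of the same idea, with made-up data and a hypothetical print_remaining_items helper (the snippet above leaves it undefined), this shows that an iterator remembers its position as it is passed around:

```python
sequence = [10, 20, 30, 40]

def print_remaining_items(it):
    """Hypothetical helper: print whatever the iterator has not yet produced."""
    for item in it:
        print(item)

it = iter(sequence)
first_item = next(it)      # consumes 10
print_remaining_items(it)  # prints 20, 30, 40
leftover = list(it)        # the iterator is now exhausted, so this is empty
```

The original list is untouched; only the iterator is used up.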
Finally, suppose you take advantage of generator functions and avoid collecting and returning entire lists. You can iterate over the elements of an iterable, inspect the individual values, and yield the ones you choose:
def generator(it):
    for item in it:
        if choose(item):
            yield item
This lets a function consume one iterable and produce its results lazily. Because the generator it returns is itself an iterable, the output of one such function can be fed straight into the next.
Thus, you can build a "stack" of iterables, with your initial sequence (or perhaps just an iterable) at the bottom, and some generator function at each higher level:
ibl = sequence
st1 = generator(ibl)
st2 = generator(st1)
st3 = generator(st2)
for item in st3:
    print(item) # Will print chosen items from sequence
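To make the stacking concrete, here is a minimal sketch in which each level gets its own predicate. The keep_if function and the three lambdas are illustrative assumptions, standing in for generator and choose above:

```python
def keep_if(choose, it):
    """Yield only the items of `it` for which choose(item) is true."""
    for item in it:
        if choose(item):
            yield item

sequence = range(1, 21)
st1 = keep_if(lambda n: n % 2 == 0, sequence)  # even numbers
st2 = keep_if(lambda n: n % 3 == 0, st1)       # ... that are multiples of 3
st3 = keep_if(lambda n: n > 6, st2)            # ... and greater than 6

chosen = list(st3)  # nothing is computed until this line pulls items through
print(chosen)       # [12, 18]
```

Each item travels through all three levels before the next one is read, so no intermediate list is ever built.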
So how would this work in practice?
Let's start with a simple use case: you have an iterable, and you wish to filter it using one or more simple conditionals.
class FilteredData:
    def __init__(self, ibl):
        self.iterable = ibl
        self.condition = self.yes

    def __iter__(self):
        # Yield only the items that satisfy the current condition.
        for item in self.iterable:
            if self.condition(item):
                yield item

    def yes(self, item):
        return True

obj = FilteredData([1,2,3,4])
for item in obj:
    print(item) # 1, 2, 3, 4

obj.condition = lambda item: item % 2 == 0
for item in obj:
    print(item) # 2, 4
How can we combine multiple conditions? By "stacking" objects. Wrap one iterable object inside another, and you "compose" the filters:
obj = FilteredData([1,2,3,4])
obj.condition = lambda item: item % 2 == 0
obj2 = FilteredData(obj)
obj2.condition = lambda item: item < 3
for item in obj2:
    print(item) # 2
Obviously, you can make things more complex. I'd suggest that you not do that until you establish a clear need.
For example, you could pass in the lambda as part of the constructor. Or subclass FilteredData.
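One way the constructor variant might look is sketched below. The optional condition parameter is an assumption made for illustration, and the class is redefined here so the snippet runs on its own:

```python
class FilteredData:
    def __init__(self, iterable, condition=None):
        self.iterable = iterable
        # With no condition given, accept every item.
        self.condition = condition if condition is not None else (lambda item: True)

    def __iter__(self):
        for item in self.iterable:
            if self.condition(item):
                yield item

evens = FilteredData([1, 2, 3, 4], lambda item: item % 2 == 0)
small_evens = FilteredData(evens, lambda item: item < 3)
print(list(small_evens))  # [2]
print(list(small_evens))  # [2] again: the underlying list can be re-iterated
```

Passing the condition up front keeps construction and configuration in one expression, at the cost of a slightly larger constructor signature.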
As another example, you could "slurp" up the entire input as part of your __iter__ method in order to compute some aggregate value (like min, max, or average), then yield the values one at a time. It's painful, since it consumes O(N) memory instead of just O(1), but sometimes it's necessary. That would require a subclass, or a more complex class.
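A rough sketch of such an aggregate-based filter follows. The AboveAverage class and its keep-items-above-the-mean rule are hypothetical and chosen only to show the "slurp, then yield" shape:

```python
class AboveAverage:
    """Hypothetical filter that keeps items greater than the mean of the input."""

    def __init__(self, iterable):
        self.iterable = iterable

    def __iter__(self):
        items = list(self.iterable)  # "slurp" the whole input: O(N) memory
        if not items:
            return                   # nothing to average, nothing to yield
        mean = sum(items) / len(items)
        for item in items:
            if item > mean:
                yield item

above = list(AboveAverage(iter([1, 2, 3, 4, 10])))
print(above)  # [10]
```

The input must be finite for this to work, which is one more reason to reach for it only when an aggregate is genuinely required.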
$endgroup$
answered 15 hours ago
– Austin Hastings