watex.transformers.StratifiedWithCategoryAdder
- class watex.transformers.StratifiedWithCategoryAdder(base_num_feature=None, threshold_operator=1.0, return_train=False, max_category=3, n_splits=1, test_size=0.2, random_state=42)
Stratified sampling transformer that generates a new category from a numerical attribute and returns the stratified trainset and test set.
- Parameters:
*base_num_feature* (str,) – Numerical feature to categorize.
*threshold_operator* (float,) – Coefficient by which the numerical feature values are divided to normalize the data.
*max_category* – Maximum category value; all categories greater than this value are gathered into it.
*return_train* (bool,) – If True, return the whole stratified trainset. This is useful when the dataset is small: it is more convenient to train on the whole trainset than on a small amount of stratified data, and with little data the stratified subsets are not always similar to one another.
Another way to stratify a dataset is to get insights from the dataset and add a new category as an additional attribute. The data can then be stratified on this new attribute after the numerical feature has been categorized. Once the data is stratified, the new category is dropped and the stratified train set and test set are returned. For instance:
>>> from watex.transformers import StratifiedWithCategoryAdder
>>> # df is assumed to be a DataFrame with a numerical 'flow' column
>>> stratifiedNumObj = StratifiedWithCategoryAdder('flow')
>>> stratifiedNumObj.fit_transform(X=df)
>>> stats2 = stratifiedNumObj.statistics_
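The roles of threshold_operator and max_category can be illustrated with a minimal pure-Python sketch. It assumes the categorization divides each value by threshold_operator, rounds up, and caps the result at max_category, as the parameter descriptions suggest. The `discretize` helper and the flow values are hypothetical and are not part of watex.

```python
import math


def discretize(values, threshold_operator=1.0, max_category=3):
    """Hypothetical helper: bin numerical values into capped integer categories."""
    categories = []
    for value in values:
        # Scale the value, then round up to the next integer category.
        category = math.ceil(value / threshold_operator)
        # Gather every category above max_category into max_category itself.
        categories.append(min(category, max_category))
    return categories


flow = [0.4, 1.0, 2.3, 3.0, 7.8]  # made-up flow values, for illustration only
print(discretize(flow))                          # [1, 1, 3, 3, 3]
print(discretize(flow, threshold_operator=2.0))  # [1, 1, 2, 2, 3]
```

A larger threshold_operator spreads the values over fewer, wider categories, while max_category keeps rare large values from forming categories too small to stratify on.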
Usage
In this example, we first categorize the flow attribute using the ceil value (see discretizeCategoriesforStratification()) and group all values greater than max_category into max_category. The result is stored in a temporary feature. The categorization is performed on this feature, and the trainset and the test set are stratified accordingly.
Notes
If base_num_feature is not given, the dataset is split using random sampling rather than stratified sampling.
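The difference between the two sampling modes can be sketched in pure Python. This is an illustration under stated assumptions, not the watex implementation: `stratified_split` and `random_split` are hypothetical helpers, and the category list is made up. The stratified version draws the same fraction from every category, so the test set keeps the category proportions of the full dataset; the random version ignores categories entirely.

```python
import random
from collections import Counter, defaultdict


def stratified_split(categories, test_size=0.2, random_state=42):
    """Hypothetical helper: return (train_idx, test_idx) stratified by category."""
    rng = random.Random(random_state)
    by_category = defaultdict(list)
    for index, category in enumerate(categories):
        by_category[category].append(index)

    train_idx, test_idx = [], []
    for indices in by_category.values():
        rng.shuffle(indices)
        # Take the same fraction of samples from each category.
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def random_split(n_samples, test_size=0.2, random_state=42):
    """Hypothetical helper: plain random split that ignores any category."""
    rng = random.Random(random_state)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_size)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# Made-up categories: 10% in category 1, 20% in category 2, 70% in category 3.
categories = [1] * 10 + [2] * 20 + [3] * 70
train_idx, test_idx = stratified_split(categories)
test_counts = Counter(categories[i] for i in test_idx)
print(sorted(test_counts.items()))  # [(1, 2), (2, 4), (3, 14)]
```

With the random split, the number of category 1 samples in the test set depends on the shuffle, which is why stratifying on a derived category helps when the dataset is small.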