Validation accuracy is not reaching the expected level

I am using the same dataset as the AutoGluon tutorial (https://autogluon.mxnet.io/tutorials/tabular_prediction/tabular-indepth.html)

Below is the custom hyperparameter tuning I ran with up to 1 hour of training, but I am still nowhere close to 99% validation accuracy. Is this a limitation of how mature the ML algorithms are, or am I doing something wrong?

parameters used:

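# Imports assumed from the linked tutorial (not shown in my original snippet)
import autogluon as ag
from autogluon import TabularPrediction as task

hp_tune = True  # whether or not to do hyperparameter optimization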

nn_options = {  # specifies non-default hyperparameter values for neural network models
    'num_epochs': 256,  # number of training epochs (controls training time of NN models)
    'learning_rate': ag.space.Real(1e-4, 3e-2, default=5e-4, log=True),  # learning rate used in training (real-valued hyperparameter searched on log-scale)
    'activation': ag.space.Categorical('relu', 'softrelu', 'softsign', 'sigmoid', 'tanh'),  # activation function used in NN (categorical hyperparameter, default = first entry)
    'layers': ag.space.Categorical(None, [200, 100], [256], [100, 50], [200, 100, 50], [50, 25], [300, 150]),
    # Each choice for categorical hyperparameter 'layers' corresponds to list of sizes for each NN layer to use
    'dropout_prob': ag.space.Real(0.0, 0.5, default=0.1),  # dropout probability (real-valued hyperparameter)
    'weight_decay': ag.space.Real(1e-12, 0.1, default=1e-6, log=True),
    'embedding_size_factor': ag.space.Real(0.5, 1.5, default=1.0),
    'network_type': ag.space.Categorical('widedeep', 'feedforward'),
    'use_batchnorm': ag.space.Categorical(True, False),
    # 'batch_size': ag.space.Int(lower=128, upper=1024, default=256),
}

gbm_options = {  # specifies non-default hyperparameter values for lightGBM gradient boosted trees
    'num_boost_round': 1024,  # number of boosting rounds (controls training time of GBM models)
    'num_leaves': ag.space.Int(lower=32, upper=256, default=128),  # number of leaves in trees (integer hyperparameter)
    'objective': 'binary',
    'metric': 'binary_logloss,binary_error',
    'learning_rate': ag.space.Real(lower=5e-3, upper=0.2, default=0.1, log=True),
    'feature_fraction': ag.space.Real(lower=0.75, upper=1.0, default=1.0),
    'min_data_in_leaf': ag.space.Int(lower=2, upper=30, default=20),
    'boosting_type': 'gbdt',
    'verbose': -1,
    'two_round': True,
    'seed_value': None
}
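
To make clear what I mean by "searched on log-scale" for `learning_rate` and `weight_decay`, here is a minimal sketch in plain Python. It only illustrates log-uniform sampling over the NN learning-rate range; the helper `sample_log_uniform` is a hypothetical name of mine, and this is not a claim about how AutoGluon or skopt draw candidates internally.

```python
import math
import random


def sample_log_uniform(lower, upper, rng):
    """Draw a value whose logarithm is uniform on [log(lower), log(upper)]."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Same bounds as the NN 'learning_rate' space above: 1e-4 to 3e-2.
samples = [sample_log_uniform(1e-4, 3e-2, rng) for _ in range(10_000)]

# On a log scale each decade gets similar attention, so a large share of
# the draws falls below 1e-3; a linear-uniform draw would put only about
# 3% of samples there.
below_1e3 = sum(s < 1e-3 for s in samples) / len(samples)
print(f"fraction of samples below 1e-3: {below_1e3:.2f}")
```

In theory the fraction below 1e-3 is ln(10) / ln(300), roughly 0.40, which is why small learning rates are explored much more than they would be on a linear scale.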

hyperparameters = {'NN': nn_options, 'GBM': gbm_options}  # hyperparameters of each model type
# hyperparameters = {'GBM': gbm_options}  # hyperparameters of each model type

# If one of these keys is missing from the hyperparameters dict, then no models of that type are trained.

time_limits = 60*60  # train various models for ~60 min
num_trials = 250  # try at most 250 different hyperparameter configurations for each type of model
search_strategy = 'skopt'  # to tune hyperparameters using SKopt Bayesian optimization routine
output_directory = 'agModels-predict'  # folder where to store trained models

predictor = task.fit(train_data=train_data, label=label_column,
                     output_directory=output_directory, time_limits=time_limits, num_trials=num_trials,
                     hyperparameter_tune=hp_tune, hyperparameters=hyperparameters, num_bagging_folds=5, stack_ensemble_levels=3,
                     search_strategy=search_strategy)
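
As a rough sanity check on these settings, the arithmetic below shows how little time each trial would get if all 250 trials were to fit in the one-hour budget. This is only a back-of-the-envelope sketch: it assumes trials run one after another and that the whole budget goes to a single model type, which is not necessarily how AutoGluon schedules the work.

```python
time_limits = 60 * 60   # total budget in seconds, as in the settings above
num_trials = 250        # trial cap per model type, as in the settings above
num_bagging_folds = 5   # as passed to fit() above

# Assumption: sequential trials, entire budget spent on one model type.
seconds_per_trial = time_limits / num_trials

# Assumption: each trial trains one model per bagging fold.
seconds_per_fold_model = seconds_per_trial / num_bagging_folds

print(f"average budget per trial: {seconds_per_trial:.1f} s")
print(f"average budget per fold model: {seconds_per_fold_model:.2f} s")
```

Under these assumptions each trial gets about 14 seconds, so the `num_trials` cap is unlikely to be what limits the search; the time limit probably is.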

output:

AutoGluon training complete, total runtime = 3496.45s …
Loaded data from: input/test.csv | Columns = 15 / 15 | Rows = 9769 -> 9769
Predictions: [' <=50K', ' <=50K', ' >50K', ' <=50K', ' <=50K']
Evaluation: accuracy on test data: 0.874501
*** Summary of fit() ***
Number of models trained: 18
Types of models trained:
{'StackerEnsembleModel', 'WeightedEnsembleModel'}
Validation performance of individual models: {'LightGBMClassifier_STACKER_l0/0': 0.8715481278632303, 'LightGBMClassifier_STACKER_l0/1': 0.8732116807002278, 'LightGBMClassifier_STACKER_l0/2': 0.8735955775087656, 'LightGBMClassifier_STACKER_l0/3': 0.8737491362321808, 'LightGBMClassifier_STACKER_l0/4': 0.8710362654518465, 'LightGBMClassifier_STACKER_l0/5': 0.8720088040334758, 'LightGBMClassifier_STACKER_l0/6': 0.8727254114094132, 'LightGBMClassifier_STACKER_l0/7': 0.8723159214803061, 'weighted_ensemble_k0_l1': 0.8739794743173035, 'LightGBMClassifier_STACKER_l1': 0.8718296521894915, 'NeuralNetClassifier_STACKER_l1': 0.8738259155938883, 'weighted_ensemble_k0_l2': 0.8738515087144576, 'LightGBMClassifier_STACKER_l2': 0.8717272797072148, 'NeuralNetClassifier_STACKER_l2': 0.8734676119059197, 'weighted_ensemble_k0_l3': 0.8739026949555959, 'LightGBMClassifier_STACKER_l3': 0.871087451692985, 'NeuralNetClassifier_STACKER_l3': 0.8731860875796585, 'weighted_ensemble_k0_l4': 0.8732628669413661}
Best model (based on validation performance): weighted_ensemble_k0_l1
Hyperparameter-tuning used: True
Bagging used: True (with 5 folds)
Stack-ensembling used: True (with 3 levels)
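
To summarize the validation scores in the log above, here is a small sketch that finds the best and worst model and the spread between them. The dictionary is copied from the "Validation performance of individual models" line; the variable names are my own.

```python
# Scores copied from the fit() summary above.
validation_scores = {
    'LightGBMClassifier_STACKER_l0/0': 0.8715481278632303,
    'LightGBMClassifier_STACKER_l0/1': 0.8732116807002278,
    'LightGBMClassifier_STACKER_l0/2': 0.8735955775087656,
    'LightGBMClassifier_STACKER_l0/3': 0.8737491362321808,
    'LightGBMClassifier_STACKER_l0/4': 0.8710362654518465,
    'LightGBMClassifier_STACKER_l0/5': 0.8720088040334758,
    'LightGBMClassifier_STACKER_l0/6': 0.8727254114094132,
    'LightGBMClassifier_STACKER_l0/7': 0.8723159214803061,
    'weighted_ensemble_k0_l1': 0.8739794743173035,
    'LightGBMClassifier_STACKER_l1': 0.8718296521894915,
    'NeuralNetClassifier_STACKER_l1': 0.8738259155938883,
    'weighted_ensemble_k0_l2': 0.8738515087144576,
    'LightGBMClassifier_STACKER_l2': 0.8717272797072148,
    'NeuralNetClassifier_STACKER_l2': 0.8734676119059197,
    'weighted_ensemble_k0_l3': 0.8739026949555959,
    'LightGBMClassifier_STACKER_l3': 0.871087451692985,
    'NeuralNetClassifier_STACKER_l3': 0.8731860875796585,
    'weighted_ensemble_k0_l4': 0.8732628669413661,
}

# Best and worst models by validation accuracy.
best_name = max(validation_scores, key=validation_scores.get)
worst_name = min(validation_scores, key=validation_scores.get)

# Gap between the best and worst score across all 18 models.
spread = validation_scores[best_name] - validation_scores[worst_name]

print(f"models: {len(validation_scores)}")
print(f"best:   {best_name} ({validation_scores[best_name]:.4f})")
print(f"worst:  {worst_name} ({validation_scores[worst_name]:.4f})")
print(f"spread: {spread:.4f}")
```

All 18 models land within about 0.3 percentage points of each other, so neither the tuning nor the three stacking levels moved validation accuracy away from roughly 87%.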

I am attaching the additional images separately, as new users can only add one image per post.

[Image: LightGBMClassifier_STACKER_l0_HPOperformanceVStrials]