How to Use This Page
This page gives you a peek under the hood of the XGBoost model powering the Titanic survival predictions. You'll find the model's configuration, the features it considers, and how it makes its predictions.
Executive Summary
Below is a high-level view of the model and the dataset.
- Model: XGBoost
- Features: 13
- Version: v1.0
- Trees Built: 100
- Tree Depth: 3
- Learning Rate: 0.1
What This Model Does
This machine learning model predicts whether a passenger on the Titanic survived, based on their personal characteristics and travel details. It's a binary classifier, meaning it makes a yes/no prediction for each passenger: survived or did not survive.
Features the Model Considers
The model analyzes 13 different passenger characteristics to make predictions:
Personal Information
- Age: Passenger's age and age group
- Sex: Gender of the passenger
- Title: Social title (Mr., Mrs., Dr., etc.)
Ticket Details
- Passenger Class: 1st, 2nd, or 3rd class
- Fare: Ticket price and price category
- Cabin: Whether passenger had a cabin
- Embarked: Port embarked from
Family Information
- Family Size: Total family members aboard
- Siblings/Spouse: Number aboard
- Parents/Children: Number aboard
- Is Alone: Traveling solo or not
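The family features above are derived rather than raw columns. A minimal sketch of how they can be computed, assuming rows shaped like the standard Kaggle Titanic columns (Name, SibSp, Parch) — the exact extraction logic used for this project may differ:

```python
import re

# Two example rows shaped like the standard Kaggle Titanic columns.
passengers = [
    {"Name": "Braund, Mr. Owen Harris", "SibSp": 1, "Parch": 0},
    {"Name": "Johnston, Miss. Catherine Helen", "SibSp": 1, "Parch": 2},
]

for p in passengers:
    # FamilySize counts the passenger plus siblings/spouse and parents/children.
    p["FamilySize"] = p["SibSp"] + p["Parch"] + 1
    # IsAlone flags passengers with no family aboard.
    p["IsAlone"] = 1 if p["FamilySize"] == 1 else 0
    # Title is the honorific between the comma and the first period in the name.
    match = re.search(r",\s*([^.]+)\.", p["Name"])
    p["Title"] = match.group(1) if match else "Unknown"
```

Encoded versions of these fields (Title_Encoded, IsAlone, FamilySize) appear in the feature list further down the page.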
How the Model Works
The Extreme Gradient Boosting (XGBoost) model works like a committee of decision-makers, built up in a set of prescribed steps:
- It builds 100 small decision trees, one after another, with each new tree trained to correct the remaining mistakes of the trees before it.
- Each tree is at most 3 levels deep, so it can ask up to 3 questions about a passenger.
- Each tree's contribution is scaled by the learning rate of 0.1; a smaller learning rate makes each step more cautious, which usually generalizes better but requires more trees.
- The final prediction sums every tree's 'vote' and converts the total into a survival probability.
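The additive logic above can be sketched in a few lines. This is a toy illustration with made-up tree scores, not the real XGBoost internals:

```python
import math

# Toy sketch of boosting's additive logic: each "tree" contributes a raw
# score, scaled by the learning rate, and the summed score is squashed
# through the logistic function to get a survival probability.
learning_rate = 0.1
tree_outputs = [0.8, 0.6, -0.2, 0.5]  # raw scores from 4 hypothetical trees

raw_score = sum(learning_rate * out for out in tree_outputs)
probability = 1 / (1 + math.exp(-raw_score))  # logistic squash
prediction = int(probability >= 0.5)          # 1 = survived, 0 = did not
```

The `binary:logistic` objective in the parameter dump below is exactly this logistic squash applied to the summed tree scores.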
Download Output Data
If you need to use the input and output data to build a dashboard using a different tool, you can download the data.
Super nerdy details 🤓
Feature Names
[ "Pclass", "Sex_Encoded", "Age", "SibSp", "Parch", "Fare", "Embarked_Encoded", "FamilySize", "IsAlone", "Title_Encoded", "HasCabin", "FareBin_Encoded", "AgeGroup_Encoded" ]
Model Parameters
{
"objective": "binary:logistic",
"base_score": null,
"booster": null,
"callbacks": null,
"colsample_bylevel": null,
"colsample_bynode": null,
"colsample_bytree": 1.0,
"device": null,
"early_stopping_rounds": null,
"enable_categorical": true,
"eval_metric": "logloss",
"feature_types": null,
"feature_weights": null,
"gamma": null,
"grow_policy": null,
"importance_type": null,
"interaction_constraints": null,
"learning_rate": 0.1,
"max_bin": null,
"max_cat_threshold": null,
"max_cat_to_onehot": null,
"max_delta_step": null,
"max_depth": 3,
"max_leaves": null,
"min_child_weight": null,
"missing": NaN,
"monotone_constraints": null,
"multi_strategy": null,
"n_estimators": 100,
"n_jobs": -1,
"num_parallel_tree": null,
"random_state": 42,
"reg_alpha": null,
"reg_lambda": null,
"sampling_method": null,
"scale_pos_weight": 1.0,
"subsample": 1.0,
"tree_method": null,
"validate_parameters": null,
"verbosity": null
}
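Most of the null entries above simply mean "use the library default" (and `"missing": NaN` is a Python repr, not valid JSON). A small sketch, over a hypothetical subset of the dump, that filters it down to the explicitly set parameters:

```python
import math

# A subset of the parameter dump above; None stands in for JSON null,
# and "missing" carries the NaN sentinel from the Python repr.
params = {
    "objective": "binary:logistic",
    "base_score": None,
    "colsample_bytree": 1.0,
    "enable_categorical": True,
    "eval_metric": "logloss",
    "learning_rate": 0.1,
    "max_depth": 3,
    "missing": float("nan"),
    "n_estimators": 100,
    "n_jobs": -1,
    "random_state": 42,
    "scale_pos_weight": 1.0,
    "subsample": 1.0,
}

def is_set(value):
    # Drop None (library defaults) and the NaN missing-value sentinel.
    return value is not None and not (isinstance(value, float) and math.isnan(value))

effective = {k: v for k, v in params.items() if is_set(v)}
```

The surviving keys match the numbers quoted in the executive summary: 100 trees, depth 3, learning rate 0.1.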