In this example we use VerbaGPT to create multiple models to predict the progression of diabetes from the famous diabetes dataset [1].
User question: Give me 3 best models to predict the iris flower. Please rank their performance, and include confusion matrix as at least one measure of performance.
VerbaGPT response:
Cost of query: $0.00
Decision Tree R2 score: 1.000
Random Forest R2 score: 0.921
Linear Regression R2 score: 0.518
VerbaGPT answers the question correctly and gives us exactly what we asked for. It is kind of incredible that we are at a point where we can ask a question in plain English and, in seconds, have this statistical cake prepared, recipe and all, that tells us all kinds of interesting things and gives us an incredible head start in our analysis.
What a time to be alive!
Check out the full VerbaGPT read-out below.
VerbaGPT Message Window
Here is a Python code snippet that imports the necessary libraries and uses three different models to predict the target variable in the diabetes dataset. The models are ranked in terms of R2 performance:

    import pandas as pd
    from sklearn.linear_model import LinearRegression
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import r2_score

    # Load data from SQL database
    query = "SELECT * FROM diabetes"
    data = pd.read_sql(query, conn)

    # Split data into features and target variable
    X = data.drop('target', axis=1)
    y = data['target']

    # Create and fit models
    lr = LinearRegression()
    lr.fit(X, y)

    dt = DecisionTreeRegressor()
    dt.fit(X, y)

    rf = RandomForestRegressor()
    rf.fit(X, y)

    # Make predictions and calculate R2 scores
    lr_pred = lr.predict(X)
    lr_r2 = r2_score(y, lr_pred)

    dt_pred = dt.predict(X)
    dt_r2 = r2_score(y, dt_pred)

    rf_pred = rf.predict(X)
    rf_r2 = r2_score(y, rf_pred)

    # Rank models by R2 score
    models = [('Linear Regression', lr_r2), ('Decision Tree', dt_r2), ('Random Forest', rf_r2)]
    models.sort(key=lambda x: x[1], reverse=True)

    # Print results
    for model in models:
        print(f'{model[0]} R2 score: {model[1]:.3f}')

This code uses the `LinearRegression`, `DecisionTreeRegressor`, and `RandomForestRegressor` models from the `sklearn` library to make predictions on the diabetes dataset. The R2 scores for each model are calculated using the `r2_score` function from the `sklearn.metrics` module. The models are then ranked by their R2 scores and the results are printed.
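One caveat when reading the scores: the snippet above fits each model on the full table and scores it on those same rows, so a perfect 1.000 for the decision tree likely reflects memorization of the training data rather than predictive skill. A minimal sketch of a held-out evaluation follows. It assumes scikit-learn's bundled `load_diabetes` data in place of the SQL table, plus an arbitrary 25% test split and a fixed seed; none of these are part of the original query or read-out.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumption: scikit-learn's bundled copy of the diabetes data stands in
# for the SQL table queried in the post.
X, y = load_diabetes(return_X_y=True)

# Hold out 25% of the rows; the split size and seed are arbitrary choices.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(random_state=0),
}

# Fit on the training rows only, then score on rows the models never saw.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# Rank from best to worst held-out R2.
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name} held-out R2 score: {score:.3f}")
```

With a held-out split the ranking may differ from the in-sample one, so it is worth comparing the two before settling on a model.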