I thought you needed advanced math to build machine learning models, but I was wrong


A lot of people, at least in the pre-“vibe coding” era, lament that they can’t program because they’re “not math people.” I wasn’t either. Here’s how I got started building machine learning models in Python anyway.

Why I hesitated

I thought I “wasn’t a math person.”

Despite my interest in technology and computers, math was a struggle, at least in my formal education. While I managed to pass a transferable introductory statistics course at a community college, I got the feeling that advanced math wasn’t for me. Even though I got interested in Linux and dabbled in code, I still felt inadequate mathematically.

Software programs like Mathematica were out of my price range. While there are more open-source alternatives available now, they didn’t seem to exist when I was in school and college, or at least I wasn’t aware of them.

The simplest code that people learn to write needs only basic arithmetic, so I could tinker and dabble with code. Still, my encounter with statistics helped me see math as something valuable with real applications.

When machine learning became popular, I thought about trying it. I signed up for a free course on Coursera but was quickly lost without the needed background.

One day, though, I felt the urge to explore statistical programming. I’d read math articles on Wikipedia from time to time, but something always seemed to block me from getting involved. Once I did, I found it was easy to do. It’s also given me a reason to self-study math: I’ve been building my own super calculator in Python and teaching myself calculus and linear algebra with Schaum’s Outlines.

Setting up my Python environment

Assembling my modeling toolbox

I needed a few tools that are different from the standard Python environment. A lot of statistics, data science, and machine learning in Python is done interactively. It’s best to explore the data and see what it can tell you rather than jump into building a model.

Histogram of restaurant tips plotted in a Jupyter notebook.

I installed IPython and Jupyter. IPython is an enhanced interactive Python interpreter that adds features like command-line editing and “magic” commands. Jupyter implements interactive notebooks to display results and share them with others. The notebooks used to be part of IPython, but they were split off into the separate Jupyter project, and IPython is still used in the background as a “kernel.” I use Pixi to install and update these tools.

pandas is a library that manages data in DataFrames. It’s similar to using a spreadsheet or relational database. Seaborn is a library with common statistical visualizations, including bar charts, scatterplots, and regression plots. statsmodels offers classic statistical models like regression. SciPy covers a wide range of scientific computing tasks, including common statistical operations.

Exploring the data

You have to know your data to model it

With my environment set up, I now had to explore the data and build a model. First, I had to import my Python libraries:

import numpy as np
import pandas as pd
import seaborn as sns
sns.set_theme()
from scipy import stats
import statsmodels.api as sm
import statsmodels.formula.api as smf
%matplotlib inline

Now I needed some data. Fortunately, Seaborn has some built-in toy datasets. One of them is a dataset from a waiter who tracked the total bill, the tips, the size of the party, and whether there were any smokers in the party in a restaurant over several weekends. Would there be any correlation between the bill and the tip?

First, I’ll load the dataset in using Seaborn:

tips = sns.load_dataset('tips')

This loads the dataset in as a pandas DataFrame.

I’ll examine the first few lines:

tips.head()
Head of a pandas DataFrame of the restaurant tips dataset from Seaborn.

And take some standard descriptive statistics like the mean, standard deviation, median (50th percentile), lower quartile (25th percentile), and upper quartile (75th percentile):

tips.describe()
Descriptive statistics using pandas of a restaurant tips dataset in a Jupyter notebook.
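
To make the rows of that table concrete, here’s a minimal sketch on a tiny made-up DataFrame. The numbers are invented for illustration and are not from the tips dataset; the name toy is just a placeholder. It shows how the “25%,” “50%,” and “75%” rows line up with the lower quartile, median, and upper quartile.

```python
import pandas as pd

# Hypothetical bills and tips, invented for illustration only.
toy = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 40.0, 50.0],
    "tip": [1.0, 3.0, 4.0, 6.0, 11.0],
})

summary = toy.describe()

# The "50%" row is the median; "25%" and "75%" are the lower and upper quartiles.
lower_q = summary.loc["25%", "tip"]
median_tip = summary.loc["50%", "tip"]
upper_q = summary.loc["75%", "tip"]

print(summary)
print(lower_q, median_tip, upper_q)
```

Because one large tip (11.0) pulls the mean above the median here, comparing the “mean” and “50%” rows is a quick way to spot skew before plotting anything.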

I’ll make a scatterplot with the total bill as the independent variable on the x-axis and the tip as the dependent variable on the y-axis.

sns.relplot(x='total_bill',y='tip',data=tips)
Tip vs bill scatterplot in Seaborn. The total bill is on the x-axis and the tip is on the y-axis. The data appears to show a positive linear relationship.

Hmm, there seems to be a positive linear relationship in this scatterplot. I could draw a line over these dots and the tip would rise along with the total bill. Some tips are higher as outliers, but this relationship seems to generally hold.
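
If I want a number to go with that impression, a correlation coefficient is one option. Here’s a rough sketch using scipy.stats.pearsonr on synthetic data that I generate to stand in for the bill and tip columns. The slope, noise level, and sample size are assumptions for illustration, not values from the real dataset.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (assumed values, not the real tips dataset):
# tips rise by about 10 cents per dollar of bill, plus random noise.
rng = np.random.default_rng(0)
total_bill = rng.uniform(5, 50, size=200)
tip = 1.0 + 0.1 * total_bill + rng.normal(0, 1, size=200)

# Pearson's r runs from -1 to 1; values near 1 suggest a strong
# positive linear relationship.
r, p_value = stats.pearsonr(total_bill, tip)
print(f"r = {r:.2f}, p = {p_value:.3g}")
```

With the real DataFrame, the same call should work by passing tips['total_bill'] and tips['tip'] as the two arguments.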

I can generate a regression plot with such a line drawn over it:

sns.regplot(x='total_bill',y='tip',data=tips)
Plot of tip vs. total bill in Seaborn using a Jupyter notebook.

From exploration to modeling

Python makes it easy to build regression models

Now that I’ve explored my data and found a positive relationship between the bill and the tip, it’s time to formally model it. This is easy to do with statsmodels:

results = smf.ols('tip ~ total_bill',data=tips).fit()
results.summary()
Summary of a linear regression of restaurant tip vs. total bill using statsmodels in a Jupyter notebook.

Briefly, this fits a model from a formula similar to those used in R and displays a summary with the model and some diagnostic information. The most useful bit is the “coef” column, the left-hand column of the coefficient table. This contains the y-intercept and the slope that you might have learned about in an elementary algebra class. Together they describe the line that was plotted over the scatterplot. If I were a restaurant manager, I would advise the waitstaff to upsell customers, since they’ll get bigger tips and contribute to the restaurant’s bottom line simultaneously.
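
For the curious, here’s a minimal sketch of the arithmetic behind a slope and intercept, using the textbook least-squares formulas in NumPy. The four data points are made up so the answer is easy to check by hand, and predict_tip is a hypothetical helper name. This isn’t how statsmodels is implemented internally, just the same idea on a small scale.

```python
import numpy as np

# Made-up points that lie exactly on the line tip = 1 + 0.1 * bill.
bill = np.array([10.0, 20.0, 30.0, 40.0])
tip = np.array([2.0, 3.0, 4.0, 5.0])

# Least-squares slope: how bill and tip vary together,
# divided by how much bill varies on its own.
bill_dev = bill - bill.mean()
tip_dev = tip - tip.mean()
slope = np.sum(bill_dev * tip_dev) / np.sum(bill_dev ** 2)

# The fitted line always passes through the point of means.
intercept = tip.mean() - slope * bill.mean()


def predict_tip(total_bill):
    """Hypothetical helper: plug a bill into the fitted line."""
    return intercept + slope * total_bill


print(slope, intercept, predict_tip(25.0))
```

In the statsmodels version, the fitted intercept and slope should also be available as results.params, so the same plug-in prediction can be done with those two numbers.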

This is the simple linear regression taught in Stats 101, but it’s really a kind of machine learning. It’s a supervised algorithm, because you’re fitting data to a known target, the y values in the original dataset.

With this model, I can plug values into it and make predictions. But what about new data? That’s where scikit-learn comes in. This is the premier machine learning library for Python. It can split the data into training and test sets, fit a model on the training set, and make predictions on the test set. I’ll demonstrate, modifying scikit-learn’s tutorial on regression.

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
X = tips[['total_bill']]
y = tips['tip']
X_train, X_test, y_train, y_test = train_test_split(X, y)
model = LinearRegression().fit(X_train, y_train)
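
One thing worth knowing about train_test_split is that, by default, it shuffles the rows and holds out 25% of them for testing, so the split changes on every run unless a seed is set. Here’s a small sketch with a stand-in array of 244 rows; the array contents and the seed value 42 are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 244 rows with one feature column (placeholder values).
X = np.arange(244).reshape(-1, 1)
y = np.arange(244)

# random_state fixes the shuffle so the split is repeatable.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

print(len(X_train), len(X_test))
```

Passing test_size (for example, test_size=0.2) changes the proportion held out.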

I can then predict tips from the “test” dataset:

y_pred = model.predict(X_test)
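
To put a number on how well those predictions match the held-out tips, scikit-learn’s metrics module can help. This is a sketch on synthetic data that I generate to stand in for the bill and tip columns (the slope, noise, and seeds are assumptions), using mean absolute error and R² as two common choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumed values, not the real tips dataset).
rng = np.random.default_rng(0)
X = rng.uniform(5, 50, size=(200, 1))
y = 1.0 + 0.1 * X[:, 0] + rng.normal(0, 1, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Mean absolute error: the typical miss, in the same units as the tip.
mae = mean_absolute_error(y_test, y_pred)

# R²: the share of variation in the test tips that the line explains.
r2 = r2_score(y_test, y_pred)

print(f"MAE = {mae:.2f}, R² = {r2:.2f}")
```

Scoring on the test set rather than the training set is the point of the split: it gives a rough idea of how the model might do on bills it has never seen.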

I’ll plot the regression lines of the training data vs. the test data, again modifying the scikit-learn example:

import matplotlib.pyplot as plt

fig, ax = plt.subplots(ncols=2, figsize=(10, 5), sharex=True, sharey=True)

ax[0].scatter(X_train, y_train, label="Train data points")
ax[0].plot(
    X_train,
    model.predict(X_train),
    linewidth=3,
    color="tab:orange",
    label="Model predictions",
)
ax[0].set(xlabel="Feature", ylabel="Target", title="Train set")
ax[0].legend()

ax[1].scatter(X_test, y_test, label="Test data points")
ax[1].plot(X_test, y_pred, linewidth=3, color="tab:orange", label="Model predictions")
ax[1].set(xlabel="Feature", ylabel="Target", title="Test set")
ax[1].legend()

fig.suptitle("Linear Regression")

plt.show()
Tip regression plots on training and test sets plotted side-by-side.

You can see all this code and more on my GitHub account.


Python does the math for me

Python has done the mathematical heavy lifting. This has freed me up to concentrate on things like wondering how valid the model is. And it’s also given me an incentive to study on my own. Modern statistics and machine learning rely heavily on calculus and linear algebra, even though I don’t solve matrix equations or calculate derivatives and integrals directly. I’ve used Python to explore those topics as well, but on my own terms. Armed with Python and my new knowledge, I can explore machine learning even more in the future.




