I tested them on a real coding challenge and one dominated

You’ve probably seen the meme where a guy opens all the popular AI chatbots in different browser tabs, gives them the same coding prompt, checks the output of each, and then copies the best one. For a moment, I thought I’d do the same experiment. So, I chose three of the most popular AIs and gave them the same problem to solve. Here’s how each one performed.

Choosing a suitable problem and judging criteria

Not too easy, not too challenging, getting that sweet middle spot

For this test, I was brainstorming what kind of coding challenge I could pick. I didn’t want to choose something like “find duplicates from a list” because most AIs would approach it the same way. I also didn’t want to try a LeetCode-style problem because that would go over the heads of many.

Besides, I also wanted a problem that is relatable to most people, has some real-world functionality, and has no single correct answer. That last point makes the test more interesting, since it lets me observe how each AI reasons and how creative it gets.

With that, the problem I finally decided on was “password strength checker.” You usually see these in action when you sign up for a service. Upon creating a password, the system often shows you how strong your password is and what you need to do to make it stronger.

This program lets me judge each AI on solution design, coding style, adherence to best practices, user-friendliness, and how close the result comes to industry standards. The scores will be based on these criteria.

Designing the prompt

Observing how much guidance each chatbot needs to reach its best solution

With the challenge problem taken care of, it’s now time to think of the prompting techniques. To keep things fair, I will give ChatGPT, Gemini, and Claude the same first prompt. Their first output will be the biggest deciding factor for scoring. After the first prompt, I may give a few follow-up prompts, such as explaining the solution, handling edge cases, refactoring the code, etc., depending on their output.

Here’s the first prompt:

Write a Python function that checks the strength of a password and labels it as Weak, Medium, or Strong.

Explain your logic clearly for a beginner.

I kept it fairly short and simple. This is to see how much the AI tool can think on its own without me giving any explicit instructions.

ChatGPT

Clean code, average explanation

For the challenge, I used the regular free web version of ChatGPT. The model in use was GPT 5.2. Here’s the solution I received:

import re

def check_password_strength(password):
    length = len(password)
    
    # Conditions
    has_lower = re.search(r"[a-z]", password)
    has_upper = re.search(r"[A-Z]", password)
    has_digit = re.search(r"[0-9]", password)
    has_special = re.search(r"[!@#$%^&*(),.?\":{}|<>]", password)
    
    score = 0
    
    # Scoring rules
    if length >= 8:
        score += 1
    if has_lower:
        score += 1
    if has_upper:
        score += 1
    if has_digit:
        score += 1
    if has_special:
        score += 1
    
    # Strength levels
    if score <= 2:
        return "Weak"
    elif score == 3 or score == 4:
        return "Medium"
    else:
        return "Strong"

ChatGPT used Python’s re module for character searching, which is interesting. For a beginner level, though, I expected it to use general Python logic instead of regular expressions. ChatGPT did ask at the end of its response whether I wanted a version using basic Python instead of regex. Otherwise, the code looks really clean and understandable, even for a beginner.
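
For comparison, here is a minimal sketch of what a regex-free version of the same five rules could look like. It is not ChatGPT’s actual follow-up output; the names check_password_strength_basic and SPECIAL_CHARS are made up for this illustration, and the symbol set is copied from the regex in the code above.

```python
import string

# Same symbols as the regex character class in the regex-based version.
SPECIAL_CHARS = set('!@#$%^&*(),.?":{}|<>')


def check_password_strength_basic(password):
    """Hypothetical regex-free variant: same five rules, same labels."""
    checks = [
        len(password) >= 8,
        any(ch in string.ascii_lowercase for ch in password),
        any(ch in string.ascii_uppercase for ch in password),
        any(ch in string.digits for ch in password),
        any(ch in SPECIAL_CHARS for ch in password),
    ]
    score = sum(checks)  # each True counts as 1, each False as 0

    if score <= 2:
        return "Weak"
    elif score <= 4:
        return "Medium"
    return "Strong"
```

Each rule becomes a plain membership test inside any(), so a beginner only needs to know loops and strings to follow it.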

The comments in the code are basic: they label each section rather than explain what it does. The explanation ChatGPT provided after the code was okay. It went section by section, explaining what each part was doing, but it wasn’t top-notch.

Lastly, ChatGPT kept the solution short and provided only the Python function I asked for. It’s not a full-fledged program: it doesn’t ask for input, give feedback on the password (such as a missing number), or even print a message, which makes it less user-friendly and far from production-grade. The only other thing ChatGPT provided was a few test lines:

print(check_password_strength("abc"))          # Weak
print(check_password_strength("abc12345"))     # Medium
print(check_password_strength("Abc@12345"))    # Strong
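
To make the missing user-facing layer concrete, here is a rough sketch of a wrapper that reports what a password lacks and can prompt for input. It is an illustration only, not something ChatGPT produced: the names RULES, missing_hints, strength_report, and prompt_and_report, and the hint wording, are all assumptions. The rules and thresholds mirror the function above.

```python
import getpass
import re

# Rule name -> (pattern, hint). Patterns mirror the function shown earlier;
# the hint wording is invented for this sketch.
RULES = {
    "lowercase": (r"[a-z]", "add a lowercase letter"),
    "uppercase": (r"[A-Z]", "add an uppercase letter"),
    "digit": (r"[0-9]", "add a number"),
    "special": (r"[!@#$%^&*(),.?\":{}|<>]", "add a special character"),
}


def missing_hints(password):
    """Return a hint for every rule the password does not satisfy."""
    hints = []
    if len(password) < 8:
        hints.append("use at least 8 characters")
    for pattern, hint in RULES.values():
        if not re.search(pattern, password):
            hints.append(hint)
    return hints


def strength_report(password):
    """Build a one-line message: the label plus any missing requirements."""
    hints = missing_hints(password)
    score = 5 - len(hints)  # one point per satisfied rule, five rules total
    if score <= 2:
        label = "Weak"
    elif score <= 4:
        label = "Medium"
    else:
        label = "Strong"
    if hints:
        return f"{label}: {', '.join(hints)}"
    return label


def prompt_and_report():
    """Ask for a password without echoing it, then print the report."""
    print(strength_report(getpass.getpass("Password: ")))
```

Calling prompt_and_report() runs the interactive version; strength_report("abc12345") returns "Medium: add an uppercase letter, add a special character".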

Gemini

Weak system, beginner-friendly code and explanation

For Gemini, I went with the free web version with the Gemini 3 Fast model. Here’s Gemini’s solution:

def check_password_strength(password):
    # Initialize our score
    score = 0
    length = len(password)
    
    # 1. Check Length
    if length >= 8:
        score += 1
        
    # 2. Check for Uppercase letters
    has_upper = any(char.isupper() for char in password)
    if has_upper:
        score += 1
        
    # 3. Check for Numbers
    has_digit = any(char.isdigit() for char in password)
    if has_digit:
        score += 1

    # Final Labeling Logic
    if length < 5:
        return "Very Weak (Too short!)"
    
    if score == 3 and length >= 10:
        return "Strong"
    elif score >= 2:
        return "Medium"
    else:
        return "Weak"

# Testing the function
print(f"'12345': {check_password_strength('12345')}")
print(f"'Password123': {check_password_strength('Password123')}")
print(f"'qwerty': {check_password_strength('qwerty')}")

Gemini went for a Pythonic approach instead of regex, using generator expressions, built-in string methods, and the any() function. The comments in the code are more descriptive, explaining what’s happening in each section. The explanation after the code was also a clear step up: it explained each section, and even each line, clearly.

However, compared with ChatGPT’s code, it’s less clean. It would have looked better with all the checks grouped together, followed by all the scoring conditions. Instead, Gemini interleaved each check with its scoring step, which makes the code harder to scan.
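
As a sketch of the grouping described above, the same logic could be laid out as checks first, then scoring, then labeling. The behavior is meant to be identical to Gemini’s function; only the order of statements changes, and the name check_password_strength_grouped is made up for this example.

```python
def check_password_strength_grouped(password):
    # --- Checks ---
    length = len(password)
    has_upper = any(char.isupper() for char in password)
    has_digit = any(char.isdigit() for char in password)

    # --- Scoring: one point per satisfied rule ---
    score = 0
    if length >= 8:
        score += 1
    if has_upper:
        score += 1
    if has_digit:
        score += 1

    # --- Labeling ---
    if length < 5:
        return "Very Weak (Too short!)"
    if score == 3 and length >= 10:
        return "Strong"
    elif score >= 2:
        return "Medium"
    else:
        return "Weak"
```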

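The biggest problem with Gemini’s code is its scoring system. There’s no check for lowercase letters and no check for special characters, which makes the program unfit for real-world usage. Any password of 10 or more characters with one uppercase letter and one digit is rated Strong, so even a guessable password such as 'Password123', one of Gemini’s own test inputs, comes out as Strong.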
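
The gap is easy to reproduce. The sketch below condenses Gemini’s three rules into a helper (the name gemini_label is made up here; the thresholds come straight from the code above) and feeds it a few easy-to-guess passwords chosen for this example.

```python
def gemini_label(password):
    """Condensed copy of the rules in Gemini's function (same behavior)."""
    length = len(password)
    # Booleans add up as integers: one point per satisfied rule.
    score = (
        (length >= 8)
        + any(c.isupper() for c in password)
        + any(c.isdigit() for c in password)
    )
    if length < 5:
        return "Very Weak (Too short!)"
    if score == 3 and length >= 10:
        return "Strong"
    return "Medium" if score >= 2 else "Weak"


# Easy-to-guess passwords that still satisfy all three rules.
for pwd in ["Password123", "AAAAAAAAA1", "QWERTY12345"]:
    print(f"{pwd!r}: {gemini_label(pwd)}")
# 'Password123': Strong
# 'AAAAAAAAA1': Strong
# 'QWERTY12345': Strong
```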

Claude

The clear winner by a mile

Coming down to our last contestant, Claude. For this, I used the free web version, selecting the Sonnet 4.6 model. Here’s the output:

import re

def check_password_strength(password):
    """
    Checks the strength of a password and returns a label:
    'Weak', 'Medium', or 'Strong'.
    """
    score = 0
    feedback = []

    # --- Rule 1: Length ---
    # Longer passwords are exponentially harder to crack.
    if len(password) >= 12:
        score += 2          # Big bonus for long passwords
    elif len(password) >= 8:
        score += 1          # Minimum acceptable length
    else:
        feedback.append("Use at least 8 characters.")

    # --- Rule 2: Uppercase letters (A-Z) ---
    # Mixing cases increases the number of possible combinations.
    if re.search(r'[A-Z]', password):
        score += 1
    else:
        feedback.append("Add at least one uppercase letter.")

    # --- Rule 3: Lowercase letters (a-z) ---
    if re.search(r'[a-z]', password):
        score += 1
    else:
        feedback.append("Add at least one lowercase letter.")

    # --- Rule 4: Digits (0-9) ---
    if re.search(r'[0-9]', password):
        score += 1
    else:
        feedback.append("Add at least one number.")

    # --- Rule 5: Special characters ---
    # Symbols dramatically expand the character set an attacker must guess from.
    if re.search(r'[!@#$%^&*(),.?":{}|<>]', password):
        score += 2          # Big bonus — special chars are very effective
    else:
        feedback.append("Add at least one special character (e.g. !, @, #).")

    # --- Scoring ---
    # Max possible score = 7 (2+1+1+1+2)
    if score >= 6:
        label = "Strong 💪"
    elif score >= 4:
        label = "Medium ⚠️"
    else:
        label = "Weak ❌"

    return label, score, feedback


# ── Demo ──────────────────────────────────────────────────────────────────────
test_passwords = [
    "abc",                    # Too short, nothing good
    "password123",            # Common, no symbols, no uppercase
    "Hello@2024",             # Good mix, decent length
    "X!9kP#mQ2$vL",          # Long, all character types
]

for pwd in test_passwords:
    label, score, tips = check_password_strength(pwd)
    print(f"\nPassword : {pwd}")
    print(f"Strength : {label}  (score: {score}/7)")
    if tips:
        print("Tips     :", " | ".join(tips))

Right off the bat, the code looks professional. Claude also used regex for the checks. It has a well-thought-out scoring system in place.

It started the function with a docstring explaining what it does, which is very common in real-world coding. The comments are descriptive enough to make the code self-explanatory. Even if I skipped the explanation section, I’d still know what the code is doing. Though I must say, the sheer number of comments made the code look less clean.

For the scoring system, Claude covered all the common checks, just like ChatGPT. One distinct difference is Claude’s weighting: passwords of 12 or more characters and passwords containing special characters each earn 2 points instead of 1, because both make a password much stronger. It also added a demo section that runs passwords of different strengths through the function for your convenience.

However, what makes Claude’s solution the most elegant is the feedback part. For each check, if your password misses it, then Claude uses a feedback list to add suggestions on what you need to do. Honestly, I was expecting this from the other two AI bots, but was let down.
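
The weighted scoring and the feedback list can be restated compactly as a table of rules. This is a hypothetical rewrite for illustration, not Claude’s output: the names RULES and score_password are assumptions, the emoji are dropped from the labels, and the points, thresholds, and tips are taken from the code above.

```python
import re

# (points, test, tip) per character rule; values follow Claude's function.
RULES = [
    (1, lambda p: re.search(r"[A-Z]", p), "Add at least one uppercase letter."),
    (1, lambda p: re.search(r"[a-z]", p), "Add at least one lowercase letter."),
    (1, lambda p: re.search(r"[0-9]", p), "Add at least one number."),
    (2, lambda p: re.search(r'[!@#$%^&*(),.?":{}|<>]', p),
     "Add at least one special character (e.g. !, @, #)."),
]


def score_password(password):
    """Table-driven restatement; returns (label, score, feedback)."""
    score, feedback = 0, []

    # Length is tiered: 2 points at 12+, 1 point at 8+, otherwise a tip.
    if len(password) >= 12:
        score += 2
    elif len(password) >= 8:
        score += 1
    else:
        feedback.append("Use at least 8 characters.")

    # Every other rule either adds its points or leaves a tip behind.
    for points, test, tip in RULES:
        if test(password):
            score += points
        else:
            feedback.append(tip)

    label = "Strong" if score >= 6 else "Medium" if score >= 4 else "Weak"
    return label, score, feedback
```

With these rules, "Hello@2024" scores 6 out of 7 (1 for length, 1 each for uppercase, lowercase, and digit, 2 for the symbol) and is labeled Strong, while "password123" scores 3 and gets two tips: add an uppercase letter and add a special character.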

To be fair, though, Claude’s explanation section was a bit weak compared to Gemini’s and ChatGPT’s. It focused more on explaining the scoring system and how it works than on explaining the code itself. However, the comments are helpful enough to make up for that gap, which makes Claude the winner of this challenge.

Not all AIs have the same coding capabilities

This was a fun experiment highlighting how each AI bot understood the coding challenge, processed it, and implemented the solution. Each approach had its strengths and weaknesses. This really makes you think about the future of AI and coding.

Stephan Dorsey

Stephan is the sports journalist for the Maple Grove Report.
