Code Like a Girl
Three Coding Interview Formats: And What Each One Actually Tests
I just finished a round of job interviews. LeetCode screens, take-home tasks, vibe coding sessions. The whole spectrum. Here’s what I actually think about each one.
LeetCode: The Secret Handshake
The theory behind LeetCode is defensible: a standardised problem, a time limit, a controlled environment. Everyone gets the same test. Objective. Comparable.
The practice is something else.
When I sit down to do a LeetCode screen, I don’t feel like I’m being evaluated. I feel like I’m proving I did the homework. It’s a secret handshake: a prerequisite you clear before the real conversation can start, regardless of whether inverting a binary tree has anything to do with the role.
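For anyone who hasn't met it, the binary tree exercise mentioned above is only a few lines once you know the shape. A minimal sketch in Python, using a hypothetical `Node` class of my own rather than anything from a specific interview:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical tree node, defined here only for illustration.
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def invert(root: Optional[Node]) -> Optional[Node]:
    """Mirror the tree in place by swapping every node's children."""
    if root is None:
        return None
    # Both recursive calls finish before the assignment, so the swap is safe.
    root.left, root.right = invert(root.right), invert(root.left)
    return root
```

The code is short. What the screen really checks is whether you reach for the recursion quickly, which is the point about preparation.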
What LeetCode actually measures is preparation. Recent, specific preparation. A senior engineer who’s shipped production systems for a decade can still fail if they haven’t practised the pattern library in the last 90 days. The format rewards the person who did the extra prep, who sat down and worked through this specific class of exercises until the shapes became familiar. If you’re preparing for it, resources like NeetCode 150 are useful because they teach the patterns, not just the answers. That’s the game: recognise the shape quickly enough, then execute under pressure.
If I’m being honest, these interviews give me the same feeling I used to get before school exams. Not fear exactly, but the sense that I’m about to be tested on how well I revised. But I also understand why companies use them. If you have thousands of applicants, LeetCode gives you a fast, standardised filter.
It’s also worth separating at-home LeetCode-style challenges from live ones. At-home challenges are harder to trust now. If the problem is self-contained and the candidate is alone, AI can solve a lot of it. That doesn’t mean everyone is cheating. It does mean the format has to account for it.
Live LeetCode has a different problem. Run badly, it becomes a memory test: have you seen this exact pattern before, and can you reproduce it while someone watches? Run better, it can at least test pattern recognition, communication, and thinking under pressure. That distinction matters.
The companies that rely on it most also have the most applicants. When you can afford high false negatives, you don’t have to fix the screen.
Take-Home Tasks: The Format That Depends Entirely on Who’s Running It
Take-homes start from a different premise: a realistic problem, your own environment, your own tools, no artificial time pressure. See what someone actually builds.
When I helped run interviews, we gave candidates a focused ML task that fit in a couple of hours. The submission wasn’t the verdict; it was the starting point. The follow-up session was where the real signal was. We’d ask about implementation decisions, probe the parts we found interesting, and ask them to make live changes. You find out quickly whether someone understands their own code or just assembled it.
In my opinion, that’s a well-designed take-home, though I might be biased since I helped design it at my previous company. The format depends heavily on how the task is run.
When a take-home goes wrong, the problems are usually the same. Scope creep: “should take 3–4 hours” becomes a weekend. Asymmetry: the candidate invests serious time with no guarantee of feedback. Gaming: without time pressure, candidates can over-polish the submission until it stops reflecting how they actually work.
There’s also a structural issue worth naming: the ask is unevenly distributed. A take-home is easier to absorb if you’re between jobs, have quiet evenings, or can spend a weekend polishing. It’s much harder if you’re currently employed, caring for someone, or already squeezed for time.
So the same format can reveal very different things depending on how it’s run. Sometimes the signal is engineering judgment. Sometimes it’s mixed with everything around the candidate’s life.
Vibe Coding: Right Direction, Early Days
The newest format: you write code while thinking aloud, often with AI tools available, with an interviewer who participates rather than just observes. The focus shifts from “did you solve it?” to “how do you think while solving it?”
LeetCode and take-homes are established formats. Vibe coding is not. The edges are still fuzzy: different companies are using the same label for very different interviews.
The two vibe coding interviews I did in this round were very different, which is part of why the format is hard to talk about cleanly.
One was a concrete spec-to-implementation task. I could use my own AI setup, so I used opencode and prompted it to clarify the plan before building. After the code was generated, the interview shifted into review: understand what was built, explain decisions, and compare tradeoffs.
The other was much more abstract: take a loose product prompt and turn it into something working. Again, the useful part was not just asking the model to generate code, but forcing clarification first. By the end, I had a rough prototype in a stack I would not normally choose, followed by a short discussion about what I would do next.
The interesting shift is not that code stops mattering. It is that code alone tells the interviewer less. If the model can generate a plausible implementation, the signal moves to everything around it: how you frame the problem, how you constrain the tool, how you test the output, and whether you understand the tradeoffs.
Traditional live coding puts a lot of weight on retrieval: can you remember the syntax, the pattern, the data structure, the trick? AI changes the value of that memory test. The scarce skill is no longer retrieving the first plausible answer. It is validating whether that answer is right.
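To make "validating" concrete, here is a rough sketch of what that can look like in a session. The function is a hypothetical stand-in for something a model might hand back (merging overlapping intervals, my own example and not a task from any of these interviews), and the checks are the kind of edge cases I would want to run before trusting it:

```python
def merge_intervals(intervals):
    """Hypothetical model-suggested implementation under review."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def check_merge_intervals():
    # Cases picked to probe where a plausible first draft tends to slip.
    assert merge_intervals([]) == []                               # empty input
    assert merge_intervals([[1, 3], [2, 6]]) == [[1, 6]]           # overlap
    assert merge_intervals([[1, 4], [4, 5]]) == [[1, 5]]           # touching
    assert merge_intervals([[1, 10], [2, 3]]) == [[1, 10]]         # nested
    assert merge_intervals([[5, 6], [1, 2]]) == [[1, 2], [5, 6]]   # unsorted


check_merge_intervals()
```

Whether touching intervals like `[1, 4]` and `[4, 5]` should merge is a spec question, not a coding one. Noticing that and asking is part of the signal.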
Prompting is not just a shortcut here. The prompt becomes part of the signal. A vague prompt usually means vague thinking. A good prompt shows that you can decompose the problem, state constraints, and force clarification before generating code.
AI can make shallow understanding more visible, not less. If you accept the first answer and cannot explain it, the follow-up questions expose that quickly. The tool can generate code, but it cannot give you ownership of the code.
There is one advantage across both versions: shallow understanding is harder to hide. You can’t prepare a polished artifact in advance. You can’t memorise the pattern for this problem type. You have to think in real time, with someone watching how you think.
But it is not neutral. Vibe coding rewards people who have already built the muscle. The developer with the maxed-out subscriptions, a tuned local setup, a library of prompts, and reusable skills for every common task is going to move faster than someone opening the tool for the first time.
I’m not sure that advantage is automatically unfair. Knowing how to use the tools well is becoming part of the job. If someone has learned how to decompose a problem, constrain the model, review its output, and recover when it goes wrong, that is real engineering judgment.
There is a real preparation gap here. Someone who has spent months building prompts, skills, and workflows will look much stronger than someone using the tool cold. That is not obviously different from LeetCode rewarding someone who spent months grinding patterns. The uncomfortable difference is cost: LeetCode practice is mostly free; serious AI-tool fluency often is not. Someone unemployed or earlier in their career may not have the same setup, even if they have the same underlying ability. So the advantage might be signal. It might also be privilege. Most interviews don’t separate those cleanly.
How different companies are running it
Companies haven’t converged on a standard and the variance is wide enough to be its own signal. I looked around at public candidate reports, interview prep posts, and discussions from people who had recently gone through these loops. I wouldn’t treat any of this as official policy, but the patterns are useful.
Shopify is one of the most AI-forward examples I found. According to Hello Interview’s write-up, candidates can expect AI-enabled coding rounds where using AI is part of the exercise, not a violation of it. Catching the model’s mistakes is part of what they’re evaluating.
Meta has also moved in this direction. Hello Interview describes repo-scale tasks with AI available, and 404 Media reported that Meta would let candidates use AI during coding tests. That’s a meaningful statement about what they think the job is now.
One Google-focused community interview guide had the most explicit rubric I found. It frames the work around decomposing problems into modular prompts, encoding constraints before generating anything, validating output rather than accepting it, and diagnosing when a failure is in the prompt vs the code. That’s actually a coherent theory of AI-era engineering judgment.
Stripe has reportedly added an AI programming exercise in HackerRank, with a built-in AI chat window. The reported task is spec-heavy, multi-part, and hard to finish by hand in the time available. The controlled environment matters: candidates get access to the same model and the same tool surface, instead of bringing whatever paid setup they already use. From candidate accounts, the signal is less “can you code this from scratch?” and more “can you read the spec quickly, guide the AI, review its output, add tests, catch edge cases, and explain your reasoning under pressure?”
Amazon has reports pointing in the same direction. In one Reddit thread, a candidate said their recruiter offered an AI-assisted coding round and reimbursement for the tool, reportedly up to $100. Another commenter said they had done the round using Cursor and were reimbursed. That doesn’t mean the whole loop has changed, but it does show how quickly the norms are moving.
The pattern: companies are not moving in one clean direction. Some are building AI into the interview environment. Some are allowing candidates to bring tools. Some are still closer to traditional coding screens. Big tech is fragmenting by role, team, and probably recruiter guidance.
The subjectivity problem
The issue is not that every company needs the same evaluation. They don’t. A company hiring for product prototyping should look for different signals than one hiring for infrastructure work. The problem is when the interview has no explicit criteria at all.
Without criteria, “good vibes” starts doing the assessment. That’s just bias with a friendlier name.
Tooling also complicates the signal. A controlled environment makes access more equal, but it may hide how someone actually works. Letting candidates bring their own setup is more realistic, but it also rewards people who have had the time and money to build one.
A good vibe coding interview makes the competencies explicit upfront. Not “write some code with AI and we’ll see how it goes,” but “we’re watching how you decompose, how you constrain, and how you validate.” The difference between that and the average version is the difference between a signal and a vibe.
What Would Actually Work
A realistic, feature-sized problem. Not algorithmic puzzles, but rather something closer to actual day-to-day work. Done collaboratively, with the interviewer as a participant. Time-boxed, but not in a way that creates artificial panic. Followed by a conversation about the decisions, not just the code.
That’s a well-run vibe coding interview with a clear rubric. It captures what the other formats are reaching for, without most of the distortions.
The harder truth: this doesn’t scale easily, which is why most companies won’t do it. And scale tends to win over signal in hiring.
What’s been your experience across these formats?
Three Coding Interview Formats: And What Each One Actually Tests was originally published in Code Like A Girl on Medium, where people are continuing the conversation by highlighting and responding to this story.