Part four, and the last one of this experiment. We've chosen to shine a spotlight on seven of the twenty projects, although we would have loved to talk about every single one of them...

Squash

1st place, 32.85/40. CG Director. 5–10 hours. Fully playable.

The Game: Cannonballs arc into the arena. Shadows show exactly where they'll land. You can shove your friends directly into the blast zone. It’s a last-man-standing party game for up to eight players, featuring two modes, power-ups, camera shake, full audio, mobile support, and a leaderboard between rounds. It came first overall for a very simple reason: it's the kind of game you understand in ten seconds and immediately want to play for an hour.

The Human Behind It: Built by our CG Director, who approached the project less like a frantic brainstorming session and more like a structured production. He set up a strict working document that Claude could operate against consistently. Every single prompt had a crystal-clear brief, and every output was immediately checked against the master plan. His main takeaway was simple: "If you do not have a plan, things will break very quickly."

The Catch: While the AI handled the bulk of the heavy lifting, the project still required a reality check when it came to the laws of mathematics. The one moment that needed genuine, manual human intervention was getting Claude to understand barrel angle geometry on the cannon models. Everything else held together beautifully, proving that AI is a brilliant execution tool - provided you have a human steering the ship who actually knows where it's supposed to land.

Who's the Shoo?

2nd place, 31.39/40. Lead Sound Designer. 20+ hours. Minor issues.

The Game: A social deduction game set in a supermarket. One or two hidden players, the "Shoo", try to terrorise customers and drain the store’s satisfaction gauge, while the employee players rush around completing retail tasks and voting each evening to eliminate the impostor. It features three special roles, a ghost mode, a surveillance terminal in the breakroom, and a full HR-process voting system that is, in context, quite funny.

The Human Behind It: The creator of this project had never written a single line of code before the KOFFEEJAM. In fact, he openly listed coding, 3D art, and UI/UX design as his weakest areas going in. Despite that, he scored an 8.71/10 on visuals - the highest in the entire group.

He managed this by building a remarkably disciplined system around the AI. He used a shared Notion document as his game design bible, ran a structured validation loop after every single change, and maintained context across prompting sessions through what he called "breadcrumb files."

The Catch: His biggest takeaway was that the real skill isn't the prompting itself - it's designing the structured workflow that the AI operates in. The game even features a companion robot that reacts to player actions with dozens of personality-driven, witty quips. That wasn't a random AI output; that was a human writer who finally had a tool capable of keeping up with his ideas.

Koffeeball

7th place, 30.02/40. QA. 5–10 hours. Prototype.

The Game: A physics-based marble runner spanning four levels. It features spinning disc obstacles with hand-tuned ball interactions, moving platforms, and a proper HUD. Essentially, it's everything you need to test whether a digital ball will behave itself when gravity is involved.

The Human Behind It: Built by one of our QA team, who came into the project expecting a bit of a wrestle with the initial setup - and wasn't disappointed. Claude wasn't particularly helpful during the "getting things to actually turn on" phase, but once the foundation was there, they settled into a decent rhythm.

His summary of the partnership was remarkably honest: "I came to expect an error margin of about 50% of the time. Claude is good at coding and changing values—it's less effective at understanding game design and debugging."

The Catch: His core lesson from the sprint? "Test your changes constantly." Which is a beautifully QA answer, given that it’s exactly what he does for a living anyway. The habit simply transferred to the machine. He walked away with a grounded 6/10 satisfaction score - the mark of someone who knows exactly what "finished" should look like and recognizes precisely where the AI hit its limits.

Planetescape

8th place, 29.97/40. Lead Developer. 10–20 hours. Minor issues.

The Game: Mine crystals. Haul them back to base. Buy rocket parts. Try not to get your stuff stolen by the other players while waiting for your defensive turrets to finish building. It’s a race to launch first, mostly involving a lot of frantic logistics and mid-run robbery.

The Human Behind It: Our Lead Developer wasn’t a stranger to AI, but he wasn’t living in it either. He spent about 15 hours feeding prompts to Claude and emerged with a flawless 10/10 satisfaction score, a finished game, and a slightly rearranged worldview. In his own words, his daily workflow has completely changed: "I'm using AI daily currently in almost all my tasks."

His summary of the experience? "Easy, simple, annoying." (Which is about as honest as tech feedback gets.)

The Catch: The experiment proved that time-consuming tasks and rapid prototyping aren't the hurdles they used to be. But genuinely understanding how powerful the tool is comes with a side of healthy realism. He built exactly what he set out to build, loves the tool, and uses it every day - but he's also looking at the future of the job with a much sharper, more critical eye. That’s not a contradiction; that's just what happens when you actually test the limits.

Don't Be IT

16th place, 26.27/40. Lead Game Designer. 5–10 hours. Fully playable.

The Game: Every second you're IT costs you HP. Everyone has a rocket launcher. Every tag triggers a massive explosion, and at the end of a three-minute round, the healthiest player wins. It’s loud, fast, and features physics-based rocket jumps that work exactly the way they should.

The Human Behind It: Built by our Lead Game Designer, who started the project with absolutely zero Roblox knowledge and finished a complete, working game loop in just ten hours. It featured full round management, tag mechanics, and an automatic winner announcement. He walked away with a 9/10 satisfaction score, mostly because he managed to take a basic idea and turn it into something that made the room genuinely laugh within a couple of minutes.

The Catch: The most interesting finding here wasn't the speed of the build, but how the experience shifted his perspective on the technology. A game designer’s job is to translate a feeling into a testable brief and articulate an experience before it exists. Doing exactly that - but at a speed he’d never had access to before - made any anxiety about the tool effectively disappear.

As he put it: "The ability to do quick and focused iteration loops is invaluable to improve a game's quality. AI allows that at a scale different than what used to exist before." It turns out that when you use the tool to supercharge your design loops, it feels less like a threat and much more like a massive competitive advantage.

Prop Hunt

17th place, 25.52/40. UX Intern. 5–10 hours. Minor issues.

The Game: Eight players. Six hiders, two seekers. You have exactly one minute to pick a prop, find a spot, and blend into the environment (or switch props mid-round if you’re feeling particularly brave). Two hits from a seeker and you’re out.

The Human Behind It: Built by one of our UX team, who ran into a proper battle with the engine's physics. Throughout the project, the camera controls and player movement aggressively fought with each other. Instead of panicking, she treated the technical mess with strict design discipline: rigorous version control, immediately binning any change that made things worse, and building each component in total isolation.

Her core philosophy was delightfully practical: "Start with the smallest working components - then attach functionality."

The Catch: What’s fascinating here is that this is a pure UX principle, simply cross-trained and applied directly to game development. While the physics didn't fully resolve by the deadline, the structured workflow kept the project alive. It turns out that when the code starts misbehaving, a solid design framework is exactly what stops you from throwing your computer out the window.

Floaty Lands

18th place, 23.70/40. Developer. 10–20 hours. Minor issues.

The Game: Crafting. Resource gathering. Rotating themed island worlds, each featuring unique enemies and rewards. It boasts a five-stage progression loop, co-op PvE, a bit of PvP, and a legendary final boss. It was easily the most ambitious game of the entire group, and remarkably, most of it actually shipped.

The Human Behind It: Built by a developer who pushed the experiment right to the absolute limit. While the code, progression, and world-rotation logic held together, the project ran into a very specific wall when it came to animation. The AI simply couldn’t generate the movement fluidity required, and it shows slightly at the edges of the experience.

The Catch: His biggest takeaway was all about boundaries: "Plan the structure first, then ask AI to follow the plan." The sprint didn't just yield a working prototype; it gave him a very clear map of exactly what the tool can carry and what it can't. He’s already changed how he works on a daily basis, because knowing precisely where a collaborator is going to let you down is a very useful thing to know before you start.

The Pattern, Named

Twenty people. Twenty games at entirely different levels of completion, varying degrees of polish, and wildly different final scores.

But as the dust settled on the KOFFEEJAM, we noticed a distinct pattern. What separated the participants who came out most satisfied - and most clear-eyed about the technology - wasn't their technical experience. It wasn't the hours they logged, and it wasn't even their prior knowledge of AI.

It was simply having a clear enough idea that the AI actually had something real to work with.

The creators who arrived with a tight brief, a defined scope, and the stubborn habit of checking every single output built games that felt finished. Those who let the scope breathe built things that were incredibly ambitious but, in some cases, much harder to complete cleanly. Both approaches produced something fascinating. But only one consistently produced satisfaction.

AI is fast. Incredibly fast. But that speed makes it dangerously easy to sprint straight into complex territory that you can't easily finish.

The KOFFEEJAM taught us - quietly, through twenty games and 300+ studio votes - that the most valuable thing you can bring to an AI collaboration isn't technical fluency.

It’s knowing exactly what you want to make. The rest, it turns out, is remarkably negotiable.

The Aftermath

All twenty games from the experiment are still up and running on Roblox. If you happen to have seven friends and a competitive streak, we’d highly recommend starting with “Squash”. Or, if you prefer accusing your colleagues of being a mischievous retail creature, give “Who's the Shoo?” a go.

That wraps up our first KOFFEEJAM series. We have plenty more experiments currently cooking in the background - and we’ll write about them the moment they do something weird enough to be worth it.