What I Learned Building with OpenClaw, Codex, and AI Coding Agents

Context

I started using AI coding agents because I wanted workflow acceleration that showed up in real output, not just faster drafts. The goal was to move from one-off prompting to a repeatable engineering system: define the work, let an agent inspect the codebase, make scoped changes, verify the result, and leave a useful trail for the next pass.

OpenClaw became useful as the coordination layer because it can connect chat surfaces, local files, browser automation, coding agents, and durable workspace notes. Codex became useful as the implementation partner because it can read the repo, edit files, run checks, and report back with concrete evidence.

The practical role OpenClaw played

In my setup, OpenClaw acted less like a code generator and more like a workflow shell around chat, repo context, browser checks, screenshots, durable notes, and approval gates.

The catch is that agentic workflows amplify good process and weak process alike. If the task is vague, the agent can produce a lot of plausible work that still misses the actual acceptance bar.

The workflow I started with

My early prompts were broad. I would describe a desired outcome, mention the rough visual direction, and expect the agent to infer the rest. That worked for small improvements, especially copy edits, CSS cleanup, simple page structure, and mechanical refactors.

I learned quickly that broad prompts are weak for multi-page frontend work because they leave the definition of "done" open. A page can be technically updated and still fail visually, have broken links, hide important content on mobile, or miss the specific reviewer feedback that motivated the work.

What broke down

The main failure mode was treating a direction as if it were an acceptance test. "Make this better" can produce motion without closure. "Add a blog index, a post page, a learning log, responsive checks, link checks, privacy review, and stop before commit" creates a much clearer finish line.

I also found that status updates need evidence behind them. A confident note saying the work is in progress is not the same as a screenshot, a link check, a diff summary, and a list of remaining blockers. Agent work needs fewer vibes and more artifacts.

The improved workflow

The pattern that worked was smaller and more explicit:

Read the existing code and design system first.
Convert the request into concrete acceptance criteria.
Make the smallest set of changes that satisfies the criteria.
Run local checks before claiming completion.
Use screenshots for frontend quality, not just HTML inspection.
Show the diff and ask for approval before commit or publish.

That workflow is slower than a single broad prompt, but it is faster than cleaning up a confident wrong answer. The useful acceleration comes from giving the agent a bounded operating box and then verifying the result like normal engineering work.
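One way to make "run local checks before claiming completion" concrete is to treat each acceptance criterion as a small check that returns evidence. The sketch below is illustrative only: `Criterion` and `run_criteria` are hypothetical names, not part of OpenClaw or Codex, and the example checks are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Criterion:
    """One acceptance check: a name plus a callable returning (passed, evidence)."""
    name: str
    check: Callable[[], Tuple[bool, str]]


def run_criteria(criteria: List[Criterion]) -> dict:
    """Run every check and collect evidence; do not stop at the first failure."""
    results = []
    for criterion in criteria:
        try:
            passed, evidence = criterion.check()
        except Exception as exc:  # a crashing check counts as a failure
            passed, evidence = False, f"check raised {exc!r}"
        results.append({"name": criterion.name, "passed": passed, "evidence": evidence})
    blockers = [r["name"] for r in results if not r["passed"]]
    return {"results": results, "blockers": blockers, "ready_for_review": not blockers}


# Placeholder checks; real ones would inspect files, run a build, or read screenshots.
report = run_criteria([
    Criterion("blog index exists", lambda: (True, "found blog/index.html")),
    Criterion("no horizontal overflow", lambda: (False, "overflow on mobile post page")),
])
print(report["blockers"])
```

The point of the shape is that "ready for review" is derived from evidence rather than asserted, and that passing every check still only means ready for human review, not approved.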

Patterns that worked

I learned to ask for agent work in slices: layout pass, copy pass, CSS pass, screenshot pass, link-check pass, and regression pass. Those slices map well to how engineers already review frontend changes.

I also found that reusable prompts should include what not to do. For example: do not auto-publish, do not expose private identifiers, do not include internal URLs, do not commit without approval, and do not treat a passing build as visual acceptance.

The pattern that worked best was "draft, verify, review." The agent can produce the draft and run the verification loop, but the human remains the final gate for taste, positioning, private context, and what is safe to say in public.

Visual verification loop

Visual verification changed the quality of the frontend work. Static HTML can be valid while the page still feels wrong: cards can misalign, diagrams can collapse, navigation can overflow, and text can become cramped on mobile.

Playwright screenshots made that visible. I found it useful to check the homepage, the index page for the new feature, and the deepest content page across desktop and mobile. For this site, that means checking the portfolio home, the Engineering Notes index, and the first post page.

Responsive checks are not just a QA nicety. They are part of the design process. If a layout requires heroic CSS to avoid horizontal overflow, the structure is probably too fragile.
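A screenshot pass like this can be sketched with Playwright's Python sync API. The route paths, viewport sizes, and output folder below are assumptions for illustration, not the site's real configuration, and the overflow test is a simple heuristic rather than a full layout audit.

```python
from pathlib import Path

# Hypothetical routes and viewport sizes; adjust to the site under test.
ROUTES = {"home": "/", "notes-index": "/notes/", "first-post": "/notes/first-post/"}
VIEWPORTS = {"desktop": (1440, 900), "mobile": (390, 844)}


def screenshot_plan(base_url, out_dir="screenshots"):
    """Return one entry (url, viewport, output path) per route and viewport."""
    plan = []
    for route_name, path in ROUTES.items():
        for viewport_name, (width, height) in VIEWPORTS.items():
            plan.append({
                "url": base_url.rstrip("/") + path,
                "viewport": {"width": width, "height": height},
                "path": str(Path(out_dir) / f"{route_name}-{viewport_name}.png"),
            })
    return plan


def capture(base_url, out_dir="screenshots"):
    """Take full-page screenshots; return the paths of pages that overflow horizontally."""
    # Imported lazily so the plan can be inspected without Playwright installed.
    from playwright.sync_api import sync_playwright

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    overflowing = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        for entry in screenshot_plan(base_url, out_dir):
            page = browser.new_page(viewport=entry["viewport"])
            page.goto(entry["url"])
            page.screenshot(path=entry["path"], full_page=True)
            # Heuristic: the document is wider than the visible viewport.
            if page.evaluate(
                "document.documentElement.scrollWidth > document.documentElement.clientWidth"
            ):
                overflowing.append(entry["path"])
            page.close()
        browser.close()
    return overflowing
```

Calling `capture("http://localhost:8000")` against a local preview server (the address is an assumption) would write six images and list any that overflow. The images still need a human look; the overflow list only catches one class of problem.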

Mini-case from this site

For this portfolio, the blog route looked complete in HTML before it was proven visually. The useful pass was narrower: check the homepage, the Engineering Notes index, and the post page on desktop and mobile; verify navigation, CTAs, link targets, and horizontal overflow; then publish only after the evidence existed.
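The link-target part of that pass can be approximated with the standard library alone. This is a minimal sketch under stated assumptions: the site is a static build, links are root-relative, and a path ending in "/" is served from an index.html file. The function names are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def built_files(site_root):
    """List built files as root-relative paths, e.g. '/notes/index.html'."""
    root = Path(site_root)
    return {"/" + p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def broken_internal_links(html, existing_files):
    """Return root-relative hrefs that do not map to a file in existing_files."""
    collector = LinkCollector()
    collector.feed(html)
    broken = []
    for href in collector.hrefs:
        parsed = urlparse(href)
        # Skip external links, mailto:, in-page anchors, and relative paths.
        if parsed.scheme or parsed.netloc or not parsed.path.startswith("/"):
            continue
        # Assumption: directory-style URLs are served from index.html.
        target = parsed.path + "index.html" if parsed.path.endswith("/") else parsed.path
        if target not in existing_files:
            broken.append(href)
    return broken
```

In use, `broken_internal_links(page_html, built_files("dist"))` would return the hrefs to fix, where `dist` stands in for whatever the build output folder is. Relative links and external URLs are deliberately out of scope here and would need their own check.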

Prompting patterns for coding agents

The strongest prompts I have used include five things:

Goal: what user-facing outcome should exist.
Scope: which files, routes, or modules are fair game.
Design direction: enough constraints to match the current product.
Acceptance checks: exact validation steps the agent must run.
Approval gate: what must not happen without human review.

For layout, I ask for stable dimensions, mobile checks, no horizontal overflow, and consistency with the existing visual language. For copy, I ask for the tone directly: practical, specific, calm, and senior. For CSS, I ask the agent to reuse existing patterns before inventing new ones. For regression checks, I ask for internal links, CTAs, screenshots, and privacy review.
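The five-part prompt can be kept as a small template so that no section is silently left out. The sketch below is one possible shape, not a format that OpenClaw or Codex requires; `AgentTask` and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentTask:
    goal: str
    scope: List[str]
    design_direction: str
    acceptance_checks: List[str]
    approval_gate: List[str]

    def render(self) -> str:
        """Render the five parts as a plain-text prompt; refuse empty sections."""
        sections = [
            ("Goal", [self.goal]),
            ("Scope", self.scope),
            ("Design direction", [self.design_direction]),
            ("Acceptance checks", self.acceptance_checks),
            ("Approval gate", self.approval_gate),
        ]
        lines = []
        for title, items in sections:
            items = [item.strip() for item in items if item.strip()]
            if not items:
                raise ValueError(f"missing section: {title}")
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
            lines.append("")
        return "\n".join(lines).rstrip() + "\n"


task = AgentTask(
    goal="Add a blog index and a first post page",
    scope=["blog routes only", "shared stylesheet"],
    design_direction="Reuse existing card and typography patterns",
    acceptance_checks=["internal links resolve", "no horizontal overflow on mobile"],
    approval_gate=["Do not commit without approval", "Do not auto-publish"],
)
print(task.render())
```

Raising on an empty section is the useful part: a prompt with no acceptance checks or no approval gate fails before it reaches the agent.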

What I would not delegate

The agent can draft layouts, improve copy, adjust CSS, run screenshots, find broken links, and prepare regression checks. I would not delegate final taste, public positioning, privacy judgment, employer-sensitive content review, architecture tradeoffs, or the final acceptance decision.

The division of labor is simple: agents are strong at producing and checking artifacts, but humans still own judgment. That is especially true when a page represents a professional identity, not just a feature branch.

Safety and privacy guardrails

Agentic systems can see more local context than a normal web editor. That makes privacy and review gates non-negotiable. I do not want every useful learning automatically turned into public content. Some lessons should stay in a private repo log until they are abstracted into public-safe language.

The guardrail I will keep using is a two-step workflow: capture the learning privately, then draft a public update separately. The draft needs a diff, a public-safety review, and explicit human approval before it becomes part of the live site.

I also separate independent project notes from employer-related work. If a lesson came from professional work, it should be abstracted into a general engineering pattern and should not claim to represent an official company view.

Review-first publishing loop

One follow-up pattern I will keep using is separating the engineering trail from the public artifact. The private trail can include raw notes, draft lessons, QA screenshots, critique output, and deployment checks. The public page should include only the parts that have been reviewed, sanitized, and approved.

That separation matters because a static site can make supporting files feel harmless until they become reachable through the public web surface. The acceptance step is not just "the page deployed." It is: the intended public routes work, private routes do not, assets are current rather than stale cached copies, the page is readable on desktop and mobile, and the update is safe to represent publicly.
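The route part of that acceptance step can be sketched as two lists: paths that must load and paths that must not. Everything named here is an assumption for illustration, including the example host and paths. The default fetcher uses `urllib`, which follows redirects, so a private path that redirects to a public page would still count as reachable.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for a GET request, including 4xx and 5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def check_routes(base_url, public_paths, private_paths, fetch=http_status):
    """Public paths must return 200; private paths must not."""
    failures = []
    for path in public_paths:
        status = fetch(base_url + path)
        if status != 200:
            failures.append(f"public route {path} returned {status}")
    for path in private_paths:
        status = fetch(base_url + path)
        if status == 200:
            failures.append(f"private route {path} is reachable")
    return failures


# Offline demonstration with a fake fetcher; the host and paths are placeholders.
fake_site = {
    "https://example.test/": 200,
    "https://example.test/notes/": 200,
    "https://example.test/private-log/": 200,
}
failures = check_routes(
    "https://example.test",
    public_paths=["/", "/notes/"],
    private_paths=["/private-log/", "/drafts/"],
    fetch=lambda url: fake_site.get(url, 404),
)
print(failures)
```

Passing the fetcher in as a parameter keeps the check testable without network access. Connection errors are left to propagate, since an unreachable host is a different failure from a wrong status code.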

The tradeoff is a little more process around publishing, but it is the right process for public professional writing. The agent can help collect evidence, draft updates, and run checks. The human still owns the decision to publish.

Changelog

Added a review-first publishing note: capture raw lessons privately, draft public-safe updates separately, and require human approval before publish.
Clarified that live route checks, private-route denial checks, responsive screenshots, and safety review are separate acceptance criteria after deployment.
Kept private repo details, artifact paths, operational details, and raw critique transcripts out of the public post.

What I will improve next

The next improvement is making the learning loop more durable. A private learning log should capture the date, context, what changed, what worked, what failed, the reusable pattern, and whether the lesson deserves a blog update. That keeps the raw learning close to the work without pushing every note into public writing.
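A log with those fields could be stored as one JSON object per line, which keeps it append-only and easy to diff. The field names, the JSON Lines format, and the sample values below are assumptions for illustration, not an existing OpenClaw feature.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LearningEntry:
    date: str
    context: str
    what_changed: str
    what_worked: str
    what_failed: str
    reusable_pattern: str
    blog_worthy: bool = False


def to_line(entry):
    """Serialize one entry as a single JSON line."""
    return json.dumps(asdict(entry), ensure_ascii=False) + "\n"


def append_entry(log_path, entry):
    """Append one entry to the private log file."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(to_line(entry))


def blog_candidates(log_path):
    """Return logged entries flagged as worth a public update."""
    with open(log_path, encoding="utf-8") as handle:
        entries = [json.loads(line) for line in handle if line.strip()]
    return [entry for entry in entries if entry["blog_worthy"]]


# Placeholder values only.
entry = LearningEntry(
    date="2025-01-15",
    context="blog route rollout",
    what_changed="added screenshot pass before publish",
    what_worked="mobile overflow caught early",
    what_failed="first prompt was too broad",
    reusable_pattern="draft, verify, review",
    blog_worthy=True,
)
print(to_line(entry), end="")
```

The `blog_worthy` flag only nominates an entry. Turning it into public text would still go through the separate draft and approval steps.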

I also want the agent workflow to produce better review packets by default: files changed, diff summary, validation results, screenshots, link-check output, and privacy review. That is the difference between an impressive demo and an engineering workflow I can trust repeatedly.
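A review packet of that kind can be modeled as a record that reports which sections are still empty. This is a minimal sketch; `ReviewPacket` is a hypothetical name and the sample values are placeholders.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class ReviewPacket:
    files_changed: List[str] = field(default_factory=list)
    diff_summary: str = ""
    validation_results: List[str] = field(default_factory=list)
    screenshots: List[str] = field(default_factory=list)
    link_check_output: str = ""
    privacy_review: str = ""

    def missing(self) -> List[str]:
        """Names of sections that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


packet = ReviewPacket(
    files_changed=["blog/index.html"],
    diff_summary="Added blog index page",
)
print(packet.missing())
```

An agent could be asked to refuse a "done" status while `missing()` is non-empty, which turns the packet from a convention into a gate.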

Reusable checklist

Define the route, page, or component being changed.
Write explicit acceptance criteria before implementation starts.
Keep the task small enough to review in one diff.
Run local validation and internal link checks.
Capture desktop and mobile screenshots for visual work.
Check for horizontal overflow on mobile.
Verify CTAs still work.
Review public content for private data, internal URLs, IDs, credentials, and screenshots that expose any of them.
Show the diff before commit, push, or publish.
Keep human review as the final gate.
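Part of the privacy item on this checklist can be pre-screened mechanically before the human review. The patterns below are illustrative assumptions, not a complete or authoritative list, and a clean scan does not replace reading the text.

```python
import re

# Illustrative patterns only; a real list would be project-specific.
PATTERNS = {
    "internal URL": re.compile(
        r"https?://(?:localhost|127\.0\.0\.1|[\w.-]+\.(?:internal|local))\b\S*"
    ),
    "email address": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "credential-like assignment": re.compile(
        r"\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+",
        re.IGNORECASE,
    ),
}


def scan_public_text(text):
    """Return (line number, label, matched text) for each suspicious match."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((number, label, match.group(0)))
    return findings


# Made-up sample text for demonstration.
sample = (
    "See https://example.com/notes/ for details.\n"
    "Debug at http://localhost:3000/admin\n"
    "API_KEY=abc123\n"
)
for finding in scan_public_text(sample):
    print(finding)
```

A scan like this is cheap enough to run on every draft. Its findings are prompts for the reviewer, and screenshots still need to be checked by eye because text patterns cannot see what an image shows.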