How much coding experience do I need to train an AI writing assistant?

You'll need basic Python knowledge and familiarity with machine learning concepts. However, the step-by-step Google Colab approach described in this article makes the process accessible even for technical writers without extensive programming experience. The most important requirement is understanding your documentation requirements and having quality training data.

What kind of hardware is required to train a documentation AI model?

You don't need specialized hardware since the training can be done using Google Colab's free GPU resources. This cloud-based approach eliminates the need for expensive local GPUs. For more complex models or larger datasets, you might consider Colab Pro or other cloud computing options with more powerful GPUs.

How long does it take to train an AI model for technical documentation?

Training time varies based on dataset size and model complexity. Fine-tuning a pre-trained model like FLAN-T5 with a few thousand documentation examples typically takes 2-5 hours on Google Colab's free tier. The initial dataset preparation and cleaning is often more time-consuming than the actual model training.

Can an AI writing assistant completely replace human technical writers?

No. AI writing assistants are powerful tools for improving consistency, catching common issues, and enforcing style guides, but they lack the contextual understanding, domain expertise, and creative problem-solving abilities of human technical writers. These AI tools work best as collaborative assistants that help writers produce better documentation faster, not as replacements.

How do I ensure my AI writing assistant doesn't introduce errors into documentation?

Implement a human-in-the-loop approach where the AI makes suggestions but humans review them before implementation. Thoroughly test the model with diverse documentation samples before deployment, maintain a comprehensive evaluation dataset, and continuously retrain the model with examples of both correct and incorrect outputs to improve accuracy over time.

Building an AI Technical Writing Assistant - Transforming Documentation with Machine Learning

This project actually emerged from challenges I faced while creating my Advanced API Documentation and AI/ML Technical Writing courses. After spending months teaching technical writers how to document complex systems, I noticed we all shared common writing patterns that could be improved with the right tools. The writing assistant you'll learn about here is the direct result of those observations.

Ever wish you had an extra pair of eyes (that never get tired) to review your technical documents? As a documentation specialist, I certainly have. Typos slip through, passive voice sneaks in, and style guide rules get overlooked when you’re juggling tight deadlines and complex AI systems.

AI writing assistant helping with technical documentation

AI writing assistant catching common documentation issues while you focus on explaining complex concepts

I embarked on a mission to train an AI for technical writing – essentially, a writing assistant who catches mistakes and enforces style guidelines. In this blog, I’ll share how I’m building this AI sidekick to fix the top 20 writing mistakes we, as technical writers, make when documenting AI/ML systems—even while following industry style guides like IBM’s, Microsoft’s MSTP, and Google’s Developer Documentation Style Guide.

Plot twist: I'm teaching AI to edit documentation about AI. If this causes a documentation singularity and your manuals become sentient, I apologize in advance. But hey, at least they'll have perfect grammar.

So grab a cup of coffee and let’s dive into the fascinating intersection of AI technology and technical writing!

Common writing mistakes in AI/ML documentation

Every technical writer has a few gremlins in their documentation. Here are some common writing mistakes that persist in AI/ML documentation, even when we know better:

1. Passive voice overuse

It’s easy to slip into “The model was trained by the data scientist” instead of the clearer active form “The data scientist trained the model.” Passive voice makes sentences less direct and can confuse readers about who’s doing what. Both MSTP and Google’s style guides recommend active voice whenever possible, but old habits die hard—like that one colleague who still uses two spaces after periods and secretly hoards WordPerfect installation disks.

2. Overly complex sentences

We’re often explaining complex AI concepts, but that doesn’t mean our sentences should be complex too. Long, winding sentences packed with clauses (and maybe a parenthetical statement or two, just like this one!) can often be split into shorter, clearer ones. If your sentence has more layers than a neural network designed to recognize layers in other neural networks, it might be time to simplify.

Complex sentence warning

If you need to refill your lung capacity halfway through reading your sentence about backpropagation, it's probably too long.

3. Inconsistent terminology and style

One minute it’s “Machine Learning Model”, the next it’s “machine learning model” in lowercase – oops. Consistency is critical when documenting AI systems. Technical style guides insist on consistent capitalization, terminology, and formatting (for example, using code font for model.predict() or bolding UI labels like Train Model). Your documentation shouldn’t have multiple personalities, unless you’re specifically documenting a multi-agent AI system.

Style guide metrics showing improvement in documentation quality with AI assistance

The dramatic improvement in documentation quality when using an AI writing assistant

4. Ambiguous references

Words like “it”, “this”, or “above” can confuse readers if it’s not crystal clear what they refer to. Ever read a document where “as mentioned below” made you scroll around like you’re searching for hidden treasure in a poorly designed video game? Yeah, we try to avoid that. Your readers shouldn’t need a map and compass to navigate your documentation.

5. Unnecessary jargon or formality

Using “utilize” instead of “use”, or “in order to” instead of “to” when describing AI workflows. These little things add up and bog down understanding. Modern style guides encourage a conversational, straightforward tone—basically, write like you speak (but maybe with fewer “ums” and “likes”). Your audience wants documentation, not a Victorian-era dissertation on the computational properties of artificial neural pathways.

Stuffy jargon	Clear alternative
Utilize the hyperparameter optimization function	Use the hyperparameter optimizer
In order to facilitate the initialization of the model	To initialize the model
It is recommended that practitioners implement regularization	We recommend using regularization
The utilization of tensor operations enables computational efficiency	Using tensor operations makes computation faster
Subsequent to preprocessing the dataset, model training can commence	After preprocessing the dataset, train the model

These are just a few of the frequent offenders. In fact, there are about 20 common issues I keep encountering in AI/ML documentation. As a conscientious writer, I strive to catch them all—like a documentation Pokemon master, but with less excitement about finding a rare passive-voice construction in the wild.

"Passive Voice, I choose you!" said no technical writer ever. Although if documentation errors were Pokemon, passive voice would definitely be a common spawn with high resistance to editing.

But let’s face it: when you’re deep in documenting a new neural architecture or a complex ML workflow, it’s easy to become blind to your own mistakes. I can’t count how many times I’ve reviewed my draft for the fifth time and still missed a glaring mistake sentence about gradient descent—which itself is ironically similar to how neural networks learn from their own mistakes.

Why we need an AI assistant for technical writing

The idea of using AI for technical writing might sound meta (AI documenting AI?), but it boils down to a simple goal: help writers create better documentation for AI systems. After identifying these common pitfalls, the next logical step was clear—what if we could train an AI to catch these issues automatically? Here’s why an AI writing assistant is the perfect sidekick for technical writers in the AI/ML space:

Quick poll: Your biggest documentation challenge

1. It never gets tired

Unlike us, an AI doesn’t need coffee or sleep. It can scan through pages of complex machine learning documentation in seconds, tirelessly flagging every small error or deviation from the style guide. No more “oops, I missed that explanation of backpropagation on page 5” moments. The AI will catch it, even if it’s buried in paragraph 37 of your technical appendix where no human editor has ventured in years.

2. Consistency and objectivity

An AI tool can apply the style rules consistently every single time. It won’t have off days. If the style guide says “Always use sentence case for headings,” it will remind you every time you accidentally Title Case Something. It’s like having a built-in style guide enforcer who never gets lenient, even on Friday afternoons when human editors are daydreaming about weekend plans instead of checking your capitalization patterns.

AI training paradigms illustration showing supervised, transfer, and reinforcement learning approaches

The training approaches that make our AI documentation assistants possible

3. Focus on higher-level writing

If the AI catches the little things (like punctuation, voice, capitalization), we writers can focus on the harder part – explaining complex AI architectures clearly, structuring the document, making sure the technical content is accurate and accessible. Basically, the AI handles the mechanics, we handle the message and meaning. It’s like having a proofreading intern who never complains about doing the tedious parts of documentation.

Human-AI documentation partnership

AI: "I noticed you used passive voice in section 3.2 about transformer models."

Me: "Thanks! I was busy trying to figure out how to explain attention mechanisms without making readers contemplate a career change."

4. Learn and adapt

Modern AI isn’t a rigid set of rules; it’s a learned model. That means it can be trained to understand context. For example, “cache” and “cash” are very different in a tech doc, and a well-trained model will know the difference between model caching and model monetization strategies. Over time, as we fine-tune it with more examples and feedback, the AI’s suggestions can get even better, unlike my human editor who still hasn’t learned that I’ll never get affect/effect right on the first try.

Now, you might ask: don’t tools like Grammarly or MS Word’s editor already do some of this? Yes, to an extent. They catch general grammar and style issues. But they’re general-purpose. They aren’t tuned to the very specific needs of AI/ML documentation writers.

When I tried using a general grammar checker on my neural network documentation, it suggested I replace "tensor" with "tenser" because it thought I was describing how tense something was. The resulting documentation would have read like a bad thriller novel about stressed-out matrices.

Google’s Developer Style Guide might say “Don’t use future tense for describing machine learning behavior”, or MSTP might have a rule about not saying “please” before every instruction. General tools might not catch those nuances.

I want an AI that’s custom-trained on tech writing style guides specific to AI/ML documentation. Plus, let’s admit it: there’s a cool factor in having an AI sidekick helping you document other AI systems. It’s like having a specialized assistant for technical writers, pointing out our documentation flaws without the awkward human interactions that come with peer reviews.

Training the AI to enforce style guides: My approach

So, how do you train an AI to become a technical writing editor specialized in AI/ML documentation? It’s been an exciting journey of coding, data gathering, and a fair share of debugging. Let me walk you through my approach:

How I trained the model

Step 1: Build a modular dataset

Instead of one monster dataset, I created small, focused files:

passive_voice.json
contractions.json
minimalism.json
long_sentences.json
And 16 more to cover all 20+ style issues

Each entry looks like this:

{
  "prompt": "Rewrite to active voice: The system was updated by the admin.",
  "output": "The admin updated the system."
}
        

This modular approach makes everything beautiful, reusable, and much easier to test.

Step 2: Train the model (Colab-style, step-by-step)

Here's exactly how I did it, without skipping any steps:

1. Open Google Colab

Go to https://colab.research.google.com
Click on + New Notebook in the bottom-right corner
Rename it to something like "train-passive-voice-model"

2. Install the required libraries

!pip uninstall -y wandb
!pip install transformers datasets accelerate -q
import os
os.environ["WANDB_DISABLED"] = "true"  # Prevents weight & biases logging
          

3. Upload your dataset

from google.colab import files
uploaded = files.upload()
          

Upload your file named passive_voice_converted.json (your training data).

4. Load the dataset

import json
with open("passive_voice_converted.json") as f:
    raw_data = json.load(f)

from datasets import Dataset
hf_dataset = Dataset.from_list(raw_data).train_test_split(test_size=0.1)
          

5. Load the model & tokenizer

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
          

6. Tokenize the dataset

def preprocess(example):
    inputs = tokenizer(example["prompt"], padding="max_length", truncation=True, max_length=128)
    targets = tokenizer(example["output"], padding="max_length", truncation=True, max_length=128)
    inputs["labels"] = targets["input_ids"]
    return inputs

encoded = hf_dataset.map(preprocess, batched=True)
          

7. Set up the trainer

from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="./flan-t5-passive-model",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=3,
    weight_decay=0.01,
    save_total_limit=1,
    logging_dir="./logs",
    logging_steps=10,
    push_to_hub=False,
    report_to=None  # Important to avoid unnecessary logging
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["test"],
    tokenizer=tokenizer
)
          

8. Train it!

trainer.train()

Sit back. Grab some tea. Stare meaningfully at your screen like it's compiling wisdom. In about 15 minutes, it's done.

9. Save and download the model

trainer.save_model("/content/flan-t5-passive-model")
tokenizer.save_pretrained("/content/flan-t5-passive-model")
!zip -r flan-t5-passive-model.zip /content/flan-t5-passive-model
from google.colab import files
files.download("flan-t5-passive-model.zip")
          

And now you own your very own writing assistant model. Next step? Put it to work!

Training on Colab was surprisingly smooth (after I uninstalled wandb to avoid runtime errors):

Upload dataset
Tokenize it
Load FLAN-T5-Base
Train with Seq2SeqTrainer
Save model

Pro tip: Always add report_to=None to your training arguments to avoid unnecessary logging systems.

Every training run felt like assembling furniture. Simple steps, but one wrong parameter and you're rebuilding from scratch.

Step 3: Test the model

After training, I ran it through test sentences. It worked beautifully:

Correctly rewrote passive voice
Understood contractions
Even simplified complex phrases with style

It was like having a mini-editor in my browser who didn't ask for coffee or ever sleep.

Step 4: What comes next

Now that I have the core engine:

I'm building a Gradio interface where users can paste or upload documents
It will scan paragraphs, detect issues, and suggest rewrites
All while following Microsoft and Google style guides, plus common sense writing principles

Coming soon: style-checking for PDF, DOCX, and MD files with ease.

Why this matters

If you're a technical writer, editor, content strategist, or just someone who writes things others read:

This project will help you tighten your writing
You'll spot style violations before your reviewer does
It's open source, so you can train it on your team's custom rules

And yes, it won't replace writers. But it will make our lives easier and give us more time to focus on complex explanations and creative content.

Want to try it?

I'll be releasing the first model (trained on passive voice) and the tool soon. If you want early access or want to contribute datasets for your writing pet peeves, find me on GitHub.

Let's make writing clearer, one sentence at a time.

1. Defining the task clearly

First, I needed the AI to know what I want. The task isn’t just “write something new” (like ChatGPT does). It’s correcting and improving existing text according to specific rules for AI documentation. This is a text-to-text transformation problem.

For each sentence or snippet with an issue, the AI should output a revised version that fixes the issue (while keeping the meaning the same). For example:

Input: “The neural network architecture was designed by the research team.”
Output: “The research team designed the neural network architecture.” (active voice fix)

2. Building a dataset of mistakes and corrections

Since I needed the model to learn, I had to feed it examples. I combed through old AI documentation, style guide examples, and my own experience to create a list of sentences with mistakes and their corrected versions.

Think of it as a before-and-after pair for each of those top 20 mistakes. In some cases, I intentionally wrote a bad sentence (violating a rule) and then wrote the “good” version following the style guide. I felt slightly guilty creating deliberately bad text, like a chef purposely overcooking pasta to show what not to do.

Try rewriting this

"The solution utilizes a transformer architecture for the processing of natural language input."

For example:

For passive voice: “The data was processed by the algorithm.” → “The algorithm processed the data.”
For terminology: “Click on the Train button.” → “Choose Train.” (Google’s guide says avoid “click on”)
For jargon: “The solution utilizes a transformer architecture.” → “The solution uses a transformer architecture.”

I ended up with dozens of such pairs for each category of mistake. Small dataset, but very targeted to AI/ML documentation. Quality over quantity—much like tech writing itself.

3. Choosing the right AI model

I decided to fine-tune a transformer-based language model for this task. (Transformers are the tech behind GPT-3, ChatGPT, and many modern AI systems we document.)

Transformer models pop quiz

What makes transformers so powerful for language tasks?

They can process sequential data in parallel

They're really good at shapeshifting

They run exclusively on caffeinated electricity

Since I’m doing this solo and on a budget, I opted for an open-source model that I could train on my modest hardware – a small T5 model from Hugging Face’s library. T5 is great at text generation tasks, and by training it on my pairs, it can learn to output the corrected sentence when given the flawed one. Think of it as teaching a very specialized grammar tutor who only knows about AI documentation.

4. Fine-tuning (teaching the AI)

Using the Hugging Face Transformers library, I set up a training pipeline. I fed in my input-output pairs so the model could adjust its weights (its “knowledge”) to map bad writing to good writing.

Training an AI model is like teaching a very literal toddler who occasionally throws tantrums in the form of runtime errors. Except this toddler consumes several gigabytes of RAM and makes my laptop fan sound like a small aircraft.

This stage was both fun and frustrating. Fun, because it’s like watching a child learn; frustrating, because like a child, the model made some hilarious mistakes early on.

At one point, it would sometimes output the word “Corrected:” at the start of every sentence because I had that in some prompt during testing. Whoops! I had to refine my training prompts and data formatting. The AI was being too literal, much like when you accidentally format your document with an extra space at the beginning of each line and then can never get rid of it.

And yes, I confess, I had a classic programmer moment of forgetting to define the tokenizer (the part that breaks sentences into tokens the model understands). That threw a nice error midway through training and I spent an evening debugging why nothing was working. Debugging AI models is like trying to teach a cat to fetch—theoretically possible, but expect a lot of confused staring and the occasional hairball of incomprehensible error messages.

But after fixing that (and a few other hiccups), the fine-tuning was complete.

5. Testing the AI on example sentences

With a trained model in hand, I wanted to see it in action. I wrote a small script (using the model in a pipeline for text generation) to feed it new sentences and get corrections.

The results were encouraging! Here’s an example of a quick test I ran in code:

# Example: converting a passive voice sentence to active voice using the trained model
text = "The neural architecture was updated by the system automatically."
prompt = f"Rewrite to active voice: {text}"
result = corrector(prompt)  # 'corrector' is our fine-tuned model pipeline
print("AI suggestion:", result[0]['generated_text'])

Model output simulator

AI suggestion: The system automatically updated the neural architecture.

In this case, the AI correctly spotted the passive construction and flipped it around. I may or may not have done a little happy dance the first time it worked—that will remain undocumented.

Similarly, it learned to suggest removing filler words. If I give: “In order to effectively utilize the transformer model, you should first initialize it.”, it suggests something like: “To effectively use the transformer model, first initialize it.”

Boom: shorter and clearer, just like a good tech editor would do. The AI isn’t perfect (yet), but seeing these fixes roll out felt like magic. It’s like having MS Word’s grammar checker, but tailored to the specific rules of AI/ML documentation and without those squiggly green lines that make you question your entire writing career.

6. Iteration and improvement

Training an AI model is not a one-and-done deal. I’m iterating on the model – expanding my dataset with more examples from AI/ML documentation, fine-tuning it further, and refining its prompts.

AI technical writer metrics dashboardPassive voice detection
92% accuracy
Jargon reduction
85% accuracy
Consistency maintenance
78% accuracy
Coffee savings
∞ cups

For instance, I realized it helps to give the model a little nudge by phrasing the input like a command (as I did with “Rewrite to active voice:”). It sets the context for what kind of edit is needed.

Eventually, the goal is that I won’t even need to specify the rule; the AI should detect the issue from the sentence itself and fix it. But as a starting point, these explicit prompts help in testing specific corrections for AI documentation.

Oh, and did I mention I’m doing this entire project solo? It’s a labor of love, powered by late-night coding and perhaps a few cups of coffee on my end (just to keep up with the no-coffee-needed machine).

Me, 3 AM, talking to my computer: "No, no, the passive voice is 'was updated BY something', not just any sentence with 'was' in it! Why is teaching AI easier than explaining this to some humans? At least the AI doesn't argue back about how its favorite professor always let it use passive voice."

The plan is to make this project completely open source once it’s polished. I strongly believe that the technical writing community documenting AI/ML could benefit from it, and also contribute to it. Imagine a collaborative AI editor that gets better as writers around the world feed it more examples of do’s and don’ts for AI documentation! It would be like a global hive mind of documentation expertise, but without the awkward team meetings.

Results so far and what’s next

As of now, the AI writing assistant I’m training is showing promising results in a controlled environment (i.e., my laptop). It reliably fixes many of the classic issues in sample sentences from AI/ML documentation.

Original text	AI-corrected version
"Note: Please ensure you save your model weights."	"Note: Save your model weights."
"The data scientist can utilize the library for creating neural networks."	"The data scientist can use the library to create neural networks."
"It is recommended to initialize the model prior to training it."	"We recommend initializing the model before training."
"The neural architecture was designed by our team to optimize performance."	"Our team designed the neural architecture to optimize performance."
"In the event that an error occurs, logging will be performed automatically."	"If an error occurs, the system logs it automatically."

I’ve tested it on some of my older blog drafts about neural networks and it caught a bunch of things I’d missed back then. Talk about a time-traveling editor! If only it could go back and fix all my college papers too.

For example, it flagged a sentence where I wrote “Note: Please ensure you save your model weights.” and suggested removing the unnecessary “Please” to match a more matter-of-fact style. That made me grin, because it’s exactly what I’d point out to someone else but had missed in my own writing—a classic case of documentation blindness.

However, there’s still a lot of work ahead before this AI becomes a truly world-class technical writing assistant for AI/ML documentation. Here’s what’s in the pipeline (no pun intended):

1. Broader testing

I plan to test the model on real-world AI documentation snippets from open source projects or public docs (the ones that allow such use). This will tell me how well the AI generalizes beyond my curated examples.

Real AI docs can be messy, and I want to ensure the AI doesn’t give weird suggestions out of context when faced with complex technical explanations. It’s one thing to correct “utilize” to “use” in a simple sentence, but quite another to maintain the technical accuracy of a paragraph explaining transformer attention mechanisms.

2. More style guide rules

I’ve so far focused on the big ticket items (voice, clarity, terminology for AI systems). Next, I want to expand the AI’s knowledge to cover more of the nitty-gritty rules from MSTP and Google’s guide.

Things like how to phrase headings for ML tutorials, use of lists vs. tables for model parameters, capitalization of AI product names, etc. There’s a long tail of guidelines specific to AI/ML documentation that could be encoded, much like there’s a long tail of lint in my dryer that somehow never gets captured no matter how many times I clean the filter.

3. Integration and UI

Ultimately, I’d love to integrate this into a handy tool – maybe a simple web app or a plugin for popular editors (VS Code, Google Docs, etc.). That way, writers can use it in their actual workflow when documenting AI systems.

I might need to rope in a friend or two for help on the UI side (my front-end skills are as rusty as documentation for a deprecated neural network model from 2012).

4. Open sourcing

I mentioned this will be open source. I’m prepping the code, documentation (oh yes, I’m documenting the documentation assistant!), and examples so that I can put it on GitHub.

Coming soon to GitHub: "Documentation-ception: How I documented my documentation tool that documents AI used for documentation." I'm considering expanding this into a full documentary: "Document This! The untold story of technical writing tools."

My hope is that other technical writers and developers working on AI/ML will try it out, give feedback, and even contribute improvements. Maybe it could evolve into a community-driven tool, with writers contributing new examples of mistakes as they encounter them in AI documentation.

Through this project, I’ve gained a deeper appreciation that writing about AI is both an art and a science. The art is in communicating complex technical ideas, and the science (or rather, the craft) is in polishing the language. An AI might never replace the artistic side (and I wouldn’t want it to), but it can certainly master the craft side with enough training.

A caffeine-free writing sidekick for AI documentation

Working on this AI assistant has been an eye-opening adventure. It started as a whimsical idea — “what if an AI could be a technical writer’s best friend for documenting AI, minus the coffee addiction?” — and it’s steadily becoming a reality. The journey from identifying common documentation issues to building a solution has taught me as much about effective writing as it has about AI development.

I’m excited (and a bit nervous) to share this project with the world. If you’re a technical writer or developer who documents AI systems, I hope this tool can eventually make your writing process smoother and your docs clearer.

Call to action

If this project piques your interest, keep an eye out on my GitHub in the coming weeks – I’ll be open-sourcing the code and model so you can try this AI writing assistant for yourself.

I’d love for you to take it for a spin, break it, improve it, and help it learn. After all, the best way to train an AI to help writers document AI is to have more writers train the AI! It’s so meta it hurts—like documenting the documentation process for a documentation tool.

Learning together

This tool's journey reflects my own learning path in technical writing. I originally documented these patterns while developing materials for my technical writing courses. If you're looking to develop your skills further, you might enjoy exploring:

The principles of clear API documentation that minimize confusion
Techniques for explaining AI/ML concepts to different audience types

These are topics I'm passionate about and continually exploring with the technical writing community.

Thank you for joining me on this journey. With a lot of coding and community support, I believe we can make AI/ML documentation a bit easier and more enjoyable for everyone.