AI and ML Fundamentals

Learn the key concepts and terminology of artificial intelligence and machine learning to effectively document these technologies.

Now that we’ve packed our bags with curiosity, let’s step into the world of Artificial Intelligence (AI) and Machine Learning (ML).

You don’t need a PhD in math to understand these concepts. Just bring your imagination, a few metaphors, and maybe a cup of coffee.

AI is the big umbrella, ML is the clever part inside, and deep learning is its overachieving cousin.


What Is AI? What Is ML? And Why Should You Care?

Let’s break it down as simply as possible.

Artificial Intelligence (AI)

Think of AI as the big idea: teaching machines to behave intelligently—like playing chess, recognizing your face in a photo, or making movie recommendations.

Machine Learning (ML)

ML is how machines learn. Instead of being programmed with strict rules, they look at data and learn from patterns, much like you get better at riding a bicycle by practicing.

Deep Learning

Deep learning is a subset of ML. It uses neural networks with many layers to learn complex things—like understanding spoken language or recognizing animals in photos.


Types of Machine Learning: How Machines Learn

Supervised Learning

This is like learning with a teacher. You show the algorithm examples with answers, and it learns to make predictions.

Example: Teach it the difference between cats and dogs using labeled images.

Common algorithms:

  • Linear and Logistic Regression
  • Support Vector Machines
  • Decision Trees and Random Forests
  • Neural Networks
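
To make this concrete, here's a minimal sketch of supervised learning with scikit-learn. The dataset is synthetic, and logistic regression is just one of the algorithms listed above:

```python
# A minimal supervised-learning sketch with scikit-learn (synthetic data).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A small labeled dataset: X holds the features, y holds the "answers".
X, y = make_classification(n_samples=200, n_features=4, random_state=42)

# Hold out some examples so we can check how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

model = LogisticRegression()
model.fit(X_train, y_train)         # learn from labeled examples
print(model.score(X_test, y_test))  # accuracy on unseen examples
```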

Unsupervised Learning

Here, the machine learns without labeled data. It finds patterns on its own.

Example: Grouping customers based on purchasing habits without telling the algorithm what the groups should be.

Common algorithms:

  • K-means Clustering
  • Hierarchical Clustering
  • Principal Component Analysis
  • Association Rules
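
As an illustration, here's a minimal K-means sketch with scikit-learn. The customer numbers are invented, and choosing two clusters is an assumption made for the example:

```python
# A minimal unsupervised-learning sketch: K-means clustering (invented data).
import numpy as np
from sklearn.cluster import KMeans

# Imaginary customers described by [annual_spend, visits_per_month].
customers = np.array([
    [200, 1], [220, 2], [250, 1],      # occasional low spenders
    [1500, 8], [1600, 9], [1700, 7],   # frequent high spenders
])

# We only say how many groups to look for; no labels are provided.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(customers)
print(labels)  # e.g. [0 0 0 1 1 1]: the algorithm found the groups on its own
```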

Reinforcement Learning

The algorithm learns by doing, receiving rewards or penalties. It’s trial and error at scale.

Example: Learning to play a video game by winning or losing points based on actions.

Common algorithms:

  • Q-Learning
  • Deep Q Networks (DQN)
  • Policy Gradient Methods
  • Actor-Critic Methods
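
To show the core idea in code, here's a minimal sketch of tabular Q-learning on a tiny invented "corridor" game (the states, rewards, and hyperparameters are all made up for illustration):

```python
# A minimal tabular Q-learning sketch: the agent starts in state 0 and
# earns a reward of +1 for reaching state 4 by moving left (0) or right (1).
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))    # the table of learned action values
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate

rng = np.random.default_rng(0)
for episode in range(500):
    state = 0
    while state != 4:
        if rng.random() < epsilon:
            action = rng.integers(n_actions)   # explore: try something random
        else:
            action = int(Q[state].argmax())    # exploit: best-known action
        next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
        reward = 1.0 if next_state == 4 else 0.0
        # Q-learning update: nudge the value toward reward + discounted future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q.round(2))  # "right" actions should end up with higher values
```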

The Machine Learning Pipeline

Think of this as the recipe for building an ML solution.

1. Data Collection and Preparation

Start with gathering data. It might come from:

  • Databases
  • Sensors or devices
  • User interactions
  • Web scraping
  • Public datasets

Then clean the data:

  • Handle missing values
  • Remove duplicates
  • Fix incorrect data types
  • Normalize values
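
Here's a small sketch of what that cleaning can look like with pandas; the column names and values are invented for illustration:

```python
# A minimal data-cleaning sketch with pandas (invented example data).
import pandas as pd

df = pd.DataFrame({
    "age": [34, None, 29, 29, 51],
    "plan": ["basic", "pro", "basic", "basic", "PRO"],
    "monthly_spend": ["19.99", "49.99", "19.99", "19.99", "49.99"],
})

df = df.drop_duplicates()                                 # remove duplicate rows
df["age"] = df["age"].fillna(df["age"].median())          # handle missing values
df["monthly_spend"] = df["monthly_spend"].astype(float)   # fix incorrect data types
df["plan"] = df["plan"].str.lower()                       # normalize values
print(df)
```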

2. Exploratory Data Analysis (EDA)

Understand your data using:

  • Statistical summaries
  • Visualizations
  • Correlation checks
  • Outlier detection
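
In code, a first pass at EDA can be as simple as the sketch below (again with a small invented dataset; real exploration usually adds plots as well):

```python
# A minimal exploratory-data-analysis sketch with pandas (invented data).
import pandas as pd

df = pd.DataFrame({
    "age": [34, 45, 29, 61, 51],
    "monthly_spend": [19.99, 49.99, 19.99, 89.99, 49.99],
    "support_tickets": [0, 2, 1, 7, 3],
})

print(df.describe())                   # statistical summaries
print(df.corr())                       # correlation checks
print(df[df["support_tickets"] > 5])   # a crude outlier check
```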

3. Feature Engineering

Features are what the model uses to make decisions:

  • Create new features
  • Scale or normalize data
  • Encode text or categories
  • Select the most relevant features
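
The sketch below shows a few of those steps with scikit-learn. The columns and the derived feature are assumptions made for the example:

```python
# A minimal feature-engineering sketch with scikit-learn (invented columns).
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({
    "age": [34, 45, 29, 61],
    "monthly_spend": [19.99, 49.99, 19.99, 89.99],
    "plan": ["basic", "pro", "basic", "enterprise"],
})

# Create a new feature from existing ones.
df["yearly_spend"] = df["monthly_spend"] * 12

preprocess = ColumnTransformer([
    ("scale", StandardScaler(), ["age", "monthly_spend", "yearly_spend"]),  # scale numeric features
    ("encode", OneHotEncoder(), ["plan"]),                                  # encode categories
])
features = preprocess.fit_transform(df)
print(features.shape)  # one row per customer, one column per prepared feature
```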

4. Model Training and Selection

Pick the right algorithm for the job, then:

  • Split data into training and validation sets
  • Train the model
  • Tune parameters (hyperparameters)
  • Use cross-validation to test stability
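
Those steps translate fairly directly into scikit-learn, as in this sketch (synthetic data; the random forest and its parameter grid are just examples):

```python
# A minimal training-and-selection sketch with scikit-learn (synthetic data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Split data into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Tune a hyperparameter with cross-validation on the training set.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100]},
    cv=5,
)
search.fit(X_train, y_train)

print(search.best_params_)         # the chosen hyperparameters
print(search.score(X_val, y_val))  # performance on held-out data
```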

5. Evaluation

Use metrics to see how well the model works.

| Task Type | Metric | Description | When to Use | Range |
| --- | --- | --- | --- | --- |
| Classification | Accuracy | Percentage of correct predictions | When classes are balanced | |
| Classification | Precision | When the model says "yes," how often it's right | When false positives are costly | |
| Classification | Recall | When the actual answer is "yes," how often the model predicts it | When false negatives are costly | |
| Classification | F1 Score | Harmonic mean of precision and recall | When a balance between precision and recall is needed | |
| Regression | MAE | Mean Absolute Error: average of absolute differences | When outliers should not have extra influence | Lower is better |
| Regression | MSE | Mean Squared Error: average of squared differences | When larger errors should be penalized more | Lower is better |
| Regression | RMSE | Root Mean Squared Error: square root of MSE | When the result should be in the same units as the target | Lower is better |
| Regression | R-squared | Coefficient of determination: share of variance explained | When you need to compare performance across datasets | |

For classification:

  • Accuracy
  • Precision and Recall
  • F1 Score
  • AUC-ROC

For regression:

  • Mean Absolute Error (MAE)
  • Mean Squared Error (MSE)
  • Root Mean Squared Error (RMSE)
  • R-squared
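
Most of these metrics are one-liners in scikit-learn. The labels and predictions below are made up purely to show the calls:

```python
# A minimal evaluation sketch with scikit-learn (made-up labels and predictions).
from sklearn.metrics import (accuracy_score, f1_score, mean_absolute_error,
                             mean_squared_error, precision_score, recall_score)

# Classification: true labels vs. model predictions.
y_true, y_pred = [1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 0, 1]
print(accuracy_score(y_true, y_pred))   # share of correct predictions
print(precision_score(y_true, y_pred))  # when the model says "yes", how often it's right
print(recall_score(y_true, y_pred))     # how many of the actual "yes" cases it caught
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall

# Regression: true values vs. predicted values.
y_true_r, y_pred_r = [3.0, 5.0, 2.5], [2.5, 5.0, 4.0]
print(mean_absolute_error(y_true_r, y_pred_r))        # MAE
print(mean_squared_error(y_true_r, y_pred_r))         # MSE
print(mean_squared_error(y_true_r, y_pred_r) ** 0.5)  # RMSE
```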

6. Deployment

Put the model into production:

  • Expose it via an API
  • Integrate it into apps
  • Set up monitoring
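
As a rough sketch of the "expose it via an API" idea, here's a tiny Flask endpoint. The model file name, the request format, and the port are all assumptions for illustration, not a production setup:

```python
# A minimal model-serving sketch with Flask. "model.joblib" and the
# {"features": [[...]]} payload format are assumptions for this example.
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.joblib")  # a model saved earlier with joblib.dump

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]   # e.g. [[34, 19.99, 239.88]]
    prediction = model.predict(features).tolist()
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    app.run(port=5000)
```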

7. Monitoring and Maintenance

Even after deployment, your model needs care:

  • Watch performance
  • Retrain as new data arrives
  • Handle concept drift

Common Machine Learning Pitfalls

Overfitting

The model memorizes the training data too well and fails on new data.

Fix it by:

  • Simplifying the model
  • Adding more data
  • Using regularization
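
Regularization, for example, is often a one-line change. The sketch below compares an unregularized linear model with Ridge regression on noisy synthetic data; how much it helps depends entirely on the data:

```python
# A minimal regularization sketch: Ridge adds an L2 penalty that shrinks
# the weights, which discourages memorizing noise (synthetic data).
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=60, n_features=40, noise=10.0, random_state=0)

plain = LinearRegression()
regularized = Ridge(alpha=10.0)  # alpha controls the strength of the penalty

print(cross_val_score(plain, X, y, cv=5).mean())        # unregularized fit
print(cross_val_score(regularized, X, y, cv=5).mean())  # regularized fit
```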

Underfitting

The model is too simple and fails to learn from the data.

Fix it by:

  • Using a more complex model
  • Adding more useful features
  • Allowing more training time

Data Leakage

The model accidentally uses information it shouldn’t have during training.

Fix it by:

  • Carefully splitting data
  • Avoiding future info
  • Designing features responsibly
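
One concrete version of "carefully splitting data": keep preprocessing inside a pipeline, so steps like scaling are fitted only on the training portion of each split. A sketch with scikit-learn on synthetic data:

```python
# A minimal leakage-avoidance sketch: because the scaler lives inside the
# pipeline, it is fitted only on the training folds during cross-validation.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# Leaky approach (avoid): StandardScaler().fit_transform(X) before splitting
# lets the test data influence the scaling statistics.

# Safer approach: scaling happens inside each cross-validation fold.
pipeline = make_pipeline(StandardScaler(), LogisticRegression())
print(cross_val_score(pipeline, X, y, cv=5).mean())
```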

Class Imbalance

When one class dominates, the model might ignore the smaller class.

Fix it by:

  • Resampling techniques
  • Generating synthetic data
  • Using appropriate evaluation metrics
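
Here's a small sketch of two of those fixes with scikit-learn: weighting the rare class during training, and judging the result with a metric that isn't fooled by imbalance (the 95/5 split is synthetic):

```python
# A minimal class-imbalance sketch (synthetic data, roughly 95% one class).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, weights=[0.95], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" makes the rare class count more during training.
model = LogisticRegression(class_weight="balanced").fit(X_train, y_train)

# F1 on the rare class says more than plain accuracy here.
print(f1_score(y_test, model.predict(X_test)))
```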

Comparing Algorithms

| Algorithm | Strengths | Weaknesses | Use Cases | Complexity |
| --- | --- | --- | --- | --- |
| Linear Regression | Simple to implement; highly interpretable; computationally efficient | Only captures linear relationships; sensitive to outliers; assumes independence of features | Price prediction; sales forecasting; risk assessment | Low |
| Logistic Regression | Outputs probabilities; resistant to overfitting; easily interpretable coefficients | Only linear decision boundaries; limited to binary/categorical outcomes; requires feature engineering | Spam detection; credit approval; disease diagnosis | Low |
| Decision Trees | Visually intuitive; handles numerical and categorical data; requires minimal preprocessing | Prone to overfitting; can create biased trees if classes are unbalanced; unstable (small changes can produce different trees) | Medical diagnosis; customer churn prediction; fault diagnosis | Medium |
| Random Forest | Robust against overfitting; handles large datasets well; provides feature importance | Less interpretable than single trees; computationally intensive; biased toward categorical variables with many levels | Customer segmentation; fraud detection; market prediction | Medium |
| SVM | Effective in high-dimensional spaces; memory efficient; works well with clear margins of separation | Slower for large datasets; sensitive to feature scaling; parameter tuning can be difficult | Text classification; image recognition; gene classification | Medium-High |
| Neural Networks | Captures complex non-linear patterns; highly adaptable to different data types; state-of-the-art performance in many domains | Requires large amounts of data; computationally expensive to train; black box with limited interpretability | Image and speech recognition; natural language processing; generative AI | High |

Neural Networks Explained

Neural networks are inspired by the human brain, but they work on numbers.

Structure

  • Input Layer: Takes in raw data (like pixels or numbers)
  • Hidden Layers: Process information and find patterns
  • Output Layer: Gives the final prediction
  • Neurons: The individual nodes, each doing a small piece of the math
  • Weights: Control how strongly inputs influence outputs

How Training Works

  1. Data goes forward through the layers (forward pass)
  2. Compare prediction to actual result (loss)
  3. Send feedback backward to update weights (backpropagation)
  4. Repeat until it gets better
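
Here's what those four steps look like in code, as a minimal sketch with PyTorch. The tiny network and the four training examples are invented for illustration:

```python
# A minimal training-loop sketch with PyTorch (tiny invented data).
import torch
import torch.nn as nn

# Invented data: 2 input features mapped to 1 target value.
X = torch.tensor([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
y = torch.tensor([[0.0], [1.0], [1.0], [0.0]])

model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))  # input -> hidden -> output
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(200):            # 4. repeat until it gets better
    prediction = model(X)          # 1. forward pass through the layers
    loss = loss_fn(prediction, y)  # 2. compare prediction to the actual result (loss)
    optimizer.zero_grad()
    loss.backward()                # 3. backpropagation computes the weight updates
    optimizer.step()               # apply the updates

print(loss.item())  # the loss should be much lower than when training began
```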

Common Types

  • Convolutional Neural Networks (CNNs): Best for images
  • Recurrent Neural Networks (RNNs): Great for sequences and time-series data
  • Transformers: Power modern language models
  • Generative Adversarial Networks (GANs): Create new content

| Network Type | Architecture | Ideal For | Famous Examples | Complexity |
| --- | --- | --- | --- | --- |
| Convolutional Neural Networks (CNNs) | Convolutional layers that detect spatial patterns at different scales | Image recognition; object detection; computer vision tasks | ResNet; VGG; Inception | Medium |
| Recurrent Neural Networks (RNNs) | Loops that allow information to persist across sequence steps | Time series data; text processing; speech recognition | LSTM; GRU; Bidirectional RNNs | Medium-High |
| Transformers | Attention mechanisms that process whole sequences in parallel | Natural language processing; text generation; language translation | BERT; GPT; T5 | High |
| Generative Adversarial Networks (GANs) | Two networks (a generator and a discriminator) competing against each other | Image generation; image-to-image translation; data augmentation | StyleGAN; CycleGAN; Pix2Pix | High |

The Limits of AI

Data Dependency

Garbage in, garbage out. Poor data = poor model.

Lack of Understanding

AI doesn’t actually “know” things—it just recognizes patterns.

Black Box Models

Deep models are hard to explain, even to experts.

Fragile in New Situations

Trained for summer? It might fail in winter unless retrained.

High Resource Demand

Training big models requires massive computing power.


Ethical Considerations in AI

AI isn’t just technical—it’s deeply human.

Bias and Fairness

Biased training data leads to biased outcomes.

Privacy

Machine learning often needs personal data, which raises questions about how that data is collected, stored, and shared.

Transparency

People should understand decisions that affect them.

Accountability

When AI fails, who takes the blame?

Environmental Impact

Training large models contributes to carbon emissions.


What This Means for Documentation

Understanding these fundamentals helps you:

Provide Technical Depth

  • Create layered content
  • Use analogies and visuals

Communicate Limitations Clearly

  • Be transparent about edge cases
  • Manage user expectations

Track Changes and Versioning

  • Show how models evolve
  • Explain what’s different between versions

Write Ethically

  • Mention fairness, privacy, and data origin
  • Include known risks and how they are handled

Exercise: Identify the ML Approach

Pick an application and answer:

  1. What type of learning is used?
  2. Is it classification, regression, clustering, etc.?
  3. What kind of data does it need?
  4. What might be hard to explain in the documentation?

Examples:

  • Spam filter
  • Product recommendation
  • Fraud detection
  • Customer segmentation
  • Self-driving car
  • Stock price prediction

Want to Learn More?

Books

  • “The Hundred-Page Machine Learning Book” by Andriy Burkov
  • “Interpretable Machine Learning” by Christoph Molnar
  • “Artificial Intelligence: A Guide for Thinking Humans” by Melanie Mitchell



What’s Coming Up Next?

Next, we’ll explore how to explain AI and ML systems to different types of readers—developers, decision-makers, and everyday users.
You’ll learn how to adjust your writing style, use the right tone, and build clarity even for the most complex concepts.

Let’s move from understanding AI to helping others understand it.