
Validation tools

The Public APIs repository uses automated validation scripts to ensure quality, consistency, and accuracy of all API entries. These tools run automatically when you submit a Pull Request.

Overview

The validation process checks:
  • Link validity - All URLs are accessible and working
  • Format compliance - Entries follow the correct structure
  • Alphabetical ordering - APIs are sorted correctly within categories
  • Duplicate detection - No duplicate links exist
  • Content standards - Descriptions, auth types, and other fields meet requirements

Link validation

Purpose

The link validation script (scripts/validate/links.py) ensures that all API links in the directory are:
  • Accessible and returning valid responses
  • Not returning error codes (4xx, 5xx)
  • Not duplicate entries
  • Free from SSL/connection errors

What it checks

Duplicate links:
  • Scans all links in the README
  • Identifies duplicate URLs using case-insensitive comparison
  • Prevents the same API from being listed multiple times
Exit condition: If duplicates are found, the check fails and lists all duplicates.
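
The duplicate check described above can be sketched as follows. This is a minimal illustration, not the code in scripts/validate/links.py: the URL pattern and the function name `find_duplicate_links` are assumptions made for the example.

```python
import re
from collections import Counter
from typing import List

# Assumption: a link is any http(s) URL up to the next whitespace or ")".
URL_PATTERN = re.compile(r"https?://[^\s)]+")


def find_duplicate_links(text: str) -> List[str]:
    """Return URLs that appear more than once, compared case-insensitively."""
    links = [link.lower() for link in URL_PATTERN.findall(text)]
    counts = Counter(links)
    return [link for link, count in counts.items() if count > 1]


readme = """
| [Cats](https://example.com/api) | Cat facts | No | Yes | Yes |
| [Dogs](https://EXAMPLE.com/api) | Dog facts | No | Yes | Yes |
| [Fish](https://example.org/fish) | Fish facts | No | Yes | Yes |
"""

duplicates = find_duplicate_links(readme)
if duplicates:
    # Mirrors the documented behavior: report every duplicate, then fail.
    print("Found duplicate links:")
    for link in duplicates:
        print(link)
```

Lowercasing every URL before counting is what makes the comparison case-insensitive, so the first two rows in the sample count as the same link.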
Link accessibility:
  • Tests each URL with an HTTP GET request
  • Uses randomized User-Agent headers to avoid blocking
  • Sets the Host header for each request
  • Timeout limit: 25 seconds per link
Checks for:
| Error type | Code | Description |
| --- | --- | --- |
| Client error | ERR:CLT | HTTP 4xx responses (except Cloudflare protection) |
| SSL error | ERR:SSL | SSL certificate or security errors |
| Connection error | ERR:CNT | Unable to establish connection |
| Timeout error | ERR:TMO | Request took longer than 25 seconds |
| Redirect error | ERR:TMR | Too many redirects |
| Unknown error | ERR:UKN | Other unexpected errors |
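
As a rough sketch of how failures could map to the error codes above, the example below classifies the result of a single request. It is not the script's implementation: `fetch` is a hypothetical callable that returns an HTTP status code or raises, `TooManyRedirects` is a stand-in exception, and the message layouts for ERR:CNT, ERR:TMR, and ERR:UKN are assumptions. The Cloudflare exemption is left out here.

```python
import ssl
from typing import Callable, Optional

TIMEOUT_SECONDS = 25  # timeout limit stated in the documentation


class TooManyRedirects(Exception):
    """Hypothetical stand-in for an HTTP client's redirect-limit error."""


def classify_link(link: str, fetch: Callable[[str, int], int]) -> Optional[str]:
    """Return an error line for a failing link, or None if the link looks fine."""
    try:
        status = fetch(link, TIMEOUT_SECONDS)
    except ssl.SSLError as error:
        return f"ERR:SSL: {error} : {link}"
    except TimeoutError:
        return f"ERR:TMO: {link}"
    except TooManyRedirects as error:
        return f"ERR:TMR: {error} : {link}"
    except ConnectionError as error:
        return f"ERR:CNT: {error} : {link}"
    except Exception as error:  # anything not covered above
        return f"ERR:UKN: {error} : {link}"
    # Follows the table: only 4xx responses are reported as client errors.
    if 400 <= status < 500:
        return f"ERR:CLT: {status} : {link}"
    return None


def fake_not_found(link: str, timeout: int) -> int:
    """Test double that pretends the server answered with 404."""
    return 404


print(classify_link("https://example.com/api", fake_not_found))
```

Passing the request function in as a parameter keeps the sketch independent of any particular HTTP library and makes each branch easy to exercise without network access.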

Cloudflare protection handling

The validator makes an exception for Cloudflare-protected sites:
  • Detects Cloudflare DDoS protection (403, 503 responses)
  • Identifies Cloudflare challenge pages
  • Allows these links to pass validation
  • Prevents false positives from security measures
Cloudflare indicators detected:
  • “Security check” pages
  • “Checking your browser” messages
  • Cloudflare Ray ID patterns
  • Challenge tokens and redirects
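
A minimal sketch of this kind of heuristic is shown below. The marker strings are taken from the indicators listed above, but their exact spelling in the real script, the function name, and the decision to inspect only the status code and body are assumptions.

```python
# Status codes the documentation associates with Cloudflare DDoS protection.
CLOUDFLARE_STATUS_CODES = {403, 503}

# Assumption: simple substrings standing in for the listed indicators.
CLOUDFLARE_MARKERS = (
    "Security check",
    "Checking your browser",
    "Ray ID",
)


def looks_like_cloudflare_challenge(status_code: int, body: str) -> bool:
    """Return True when a response resembles a Cloudflare challenge page."""
    if status_code not in CLOUDFLARE_STATUS_CODES:
        return False
    return any(marker in body for marker in CLOUDFLARE_MARKERS)


challenge_page = "<title>Just a moment</title><p>Checking your browser...</p>"
print(looks_like_cloudflare_challenge(503, challenge_page))
```

A link whose response matches would be allowed to pass rather than being reported as ERR:CLT, which is how the false positives mentioned above are avoided.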

How to run

cd scripts
python validate/links.py ../README.md
This will:
  1. Check for duplicate links
  2. Validate all links are working

Check only duplicates

cd scripts
python validate/links.py ../README.md --only_duplicate_links_checker
# or
python validate/links.py ../README.md -odlc
This will only check for duplicate links without validating accessibility.

Understanding output

Success:
Checking for duplicate links...
No duplicate links.
Checking if 1234 links are working...

Duplicates found:
Checking for duplicate links...
Found duplicate links:
https://example.com/api
https://duplicate.com/docs

Broken links:
Apparently 3 links are not working properly. See in:
ERR:CLT: 404 : https://example.com/api
ERR:TMO: https://slow-site.com
ERR:SSL: [SSL: CERTIFICATE_VERIFY_FAILED] : https://bad-ssl.com

Format validation

Purpose

The format validation script (scripts/validate/format.py) ensures that all entries follow the standardized structure and meet quality requirements.

What it checks

Alphabetical order

  • Verifies APIs are sorted alphabetically within each category
  • Case-insensitive comparison
  • Helps prevent duplicates and makes navigation easier
Error example:
(L045) Animals category is not alphabetical order
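
The ordering rule can be illustrated with a short sketch. The function name is hypothetical and the real script may normalize titles differently; the error text is the one shown above.

```python
from typing import List, Optional


def check_alphabetical(category: str, titles: List[str]) -> Optional[str]:
    """Return an error message if titles are not in case-insensitive order."""
    normalized = [title.casefold() for title in titles]
    if normalized != sorted(normalized):
        return f"{category} category is not alphabetical order"
    return None


print(check_alphabetical("Animals", ["Cats", "axolotl", "Dogs"]))
```

Comparing the normalized list with its sorted copy flags the whole category rather than a single entry, which matches the shape of the error message.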

Title format

Checks:
  • Title must use [TITLE](URL) Markdown link syntax
  • Title must not end with “API” or “… API”
  • Title must not include TLD (like .com, .org)
Error examples:
(L123) Title syntax should be "[TITLE](LINK)"
(L124) Title should not end with "... API". Every entry is an API here!
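
The first two title rules could be checked roughly as below. The regular expression and function name are assumptions for illustration, and the TLD rule is omitted because the text does not say how a TLD is detected.

```python
import re
from typing import List

# Assumption: a title cell is exactly one Markdown link with an http(s) URL.
TITLE_LINK_PATTERN = re.compile(r"^\[(.+)\]\((https?://.+)\)$")


def check_title(raw_title: str) -> List[str]:
    """Return the error messages that apply to one title cell."""
    match = TITLE_LINK_PATTERN.match(raw_title.strip())
    if match is None:
        return ['Title syntax should be "[TITLE](LINK)"']
    errors = []
    title = match.group(1)
    # Case-insensitive check for a trailing " API".
    if title.upper().endswith(" API"):
        errors.append('Title should not end with "... API". Every entry is an API here!')
    return errors


print(check_title("[GitHub API](https://docs.github.com)"))
```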

Description requirements

Checks:
  • First character must be capitalized
  • Must not exceed 100 characters
  • Must not end with punctuation (., !, ,, etc.)
Error examples:
(L089) first character of description is not capitalized
(L090) description should not end with .
(L091) description should not exceed 100 characters (currently 125)
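
A sketch of the three description rules follows. The set of forbidden ending characters is an assumption, since the list above ends in "etc."; the messages are copied from the error examples.

```python
from typing import List

MAX_DESCRIPTION_LENGTH = 100
# Assumption: only the characters named explicitly in the rule above.
FORBIDDEN_ENDINGS = (".", "!", ",")


def check_description(description: str) -> List[str]:
    """Return the error messages that apply to one description."""
    errors = []
    first_char = description[:1]
    if first_char != first_char.upper():
        errors.append("first character of description is not capitalized")
    if description.endswith(FORBIDDEN_ENDINGS):
        errors.append(f"description should not end with {description[-1]}")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        errors.append(
            f"description should not exceed {MAX_DESCRIPTION_LENGTH} characters "
            f"(currently {len(description)})"
        )
    return errors


print(check_description("a list of cat facts."))
```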

Auth field validation

Checks:
  • Must be one of: OAuth, apiKey, X-Mashape-Key, User-Agent, or No
  • Must be wrapped in backticks (except No)
  • Uses exact case-sensitive matching
Error examples:
(L156) auth value is not enclosed with `backticks`
(L157) Bearer is not a valid Auth option
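
The Auth rules can be sketched like this. The allowed values and messages come from the text above; the function name and the order in which the two checks run are assumptions.

```python
from typing import List

AUTH_OPTIONS = ("OAuth", "apiKey", "X-Mashape-Key", "User-Agent", "No")


def check_auth(auth: str) -> List[str]:
    """Return the error messages that apply to one Auth cell."""
    errors = []
    has_backticks = len(auth) >= 2 and auth.startswith("`") and auth.endswith("`")
    if auth != "No" and not has_backticks:
        errors.append("auth value is not enclosed with `backticks`")
    value = auth.strip("`")
    # Membership test is case-sensitive, so "apikey" is rejected.
    if value not in AUTH_OPTIONS:
        errors.append(f"{value} is not a valid Auth option")
    return errors


print(check_auth("`Bearer`"))
```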

HTTPS field validation

Checks:
  • Must be exactly Yes or No
  • Case-sensitive
Error example:
(L178) yes is not a valid HTTPS option

CORS field validation

Checks:
  • Must be exactly Yes, No, or Unknown
  • Case-sensitive
Error example:
(L199) unknown is not a valid CORS option
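
Because the HTTPS and CORS rules differ only in their allowed values, one small helper can illustrate both. This is a sketch under that assumption; the helper name is hypothetical.

```python
from typing import Optional, Tuple

HTTPS_OPTIONS = ("Yes", "No")
CORS_OPTIONS = ("Yes", "No", "Unknown")


def check_option(value: str, field: str, allowed: Tuple[str, ...]) -> Optional[str]:
    """Return an error message unless value exactly matches an allowed option."""
    if value not in allowed:
        return f"{value} is not a valid {field} option"
    return None


print(check_option("yes", "HTTPS", HTTPS_OPTIONS))
print(check_option("unknown", "CORS", CORS_OPTIONS))
```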

Table structure

Checks:
  • Each entry must have at least 5 columns (API, Description, Auth, HTTPS, CORS)
  • Each segment must start and end with exactly 1 space
  • Proper table formatting with | separators
Error examples:
(L234) entry does not have all the required columns (have 4, need 5)
(L235) each segment must start and end with exactly 1 space
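
One way to picture the table checks is to split a row on its | separators and inspect each segment. The sketch below assumes rows begin and end with a pipe and that no cell contains a literal pipe; the function name is hypothetical.

```python
from typing import List

REQUIRED_COLUMNS = 5  # API, Description, Auth, HTTPS, CORS


def check_row(row: str) -> List[str]:
    """Return the error messages that apply to one table row."""
    # Drop the outer pipes, then split the rest on the inner pipes.
    segments = row.strip().strip("|").split("|")
    if len(segments) < REQUIRED_COLUMNS:
        return [
            "entry does not have all the required columns "
            f"(have {len(segments)}, need {REQUIRED_COLUMNS})"
        ]
    for segment in segments:
        content = segment.strip()
        # A well-formed segment is its content with exactly one space each side.
        if content == "" or segment != f" {content} ":
            return ["each segment must start and end with exactly 1 space"]
    return []


print(check_row("| [Cats](https://example.com) | Cat facts | No | Yes |"))
```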

Category validation

Checks:
  • Each category must have minimum 3 entries
  • Category headers must be listed in the Index section
  • Headers must use ### format
Error examples:
(L067) category header (New Category) not added to Index section
(L068) Example category does not have the minimum 3 entries (only has 2)
(L069) category header is not formatted correctly
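
The category rules could be expressed as a single check per header. In this sketch the list of Index titles and the entry count are passed in directly; how the real script collects them is not covered by the text above, so treat those parameters as assumptions.

```python
import re
from typing import List

MIN_ENTRIES = 3
# Exactly three "#" characters followed by a space and the category name.
CATEGORY_HEADER = re.compile(r"^### (.+)$")


def check_category(header_line: str, index_titles: List[str], entry_count: int) -> List[str]:
    """Return the error messages that apply to one category."""
    match = CATEGORY_HEADER.match(header_line)
    if match is None:
        return ["category header is not formatted correctly"]
    category = match.group(1).strip()
    errors = []
    if category not in index_titles:
        errors.append(f"category header ({category}) not added to Index section")
    if entry_count < MIN_ENTRIES:
        errors.append(
            f"{category} category does not have the minimum {MIN_ENTRIES} entries "
            f"(only has {entry_count})"
        )
    return errors


print(check_category("### Example", ["Animals"], 2))
```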

How to run

cd scripts
python validate/format.py ../README.md

Understanding output

Success:
  • No output (script exits with code 0)
  • All checks passed
Errors found:
(L045) Animals category is not alphabetical order
(L089) first character of description is not capitalized
(L123) Title should not end with "... API". Every entry is an API here!
(L156) apikey is not a valid Auth option
Each line shows:
  • (L045) - Line number in README.md (line 46, since counting starts at 0)
  • Error description
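
If you want to process this output in your own tooling, each line can be split into its two parts. The sketch below is a hypothetical helper, not part of the repository; it returns the number exactly as printed inside the parentheses.

```python
import re
from typing import Optional, Tuple

# "(L" + digits + ")" followed by a space and the error description.
ERROR_LINE = re.compile(r"^\(L(\d+)\) (.+)$")


def parse_format_error(line: str) -> Optional[Tuple[int, str]]:
    """Split one output line into (reported number, error description)."""
    match = ERROR_LINE.match(line.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(parse_format_error("(L156) apikey is not a valid Auth option"))
```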

Running tests locally

Before submitting a Pull Request, run both validation scripts:
cd scripts

# Run format validation
python validate/format.py ../README.md

# Run link validation (duplicates only, faster)
python validate/links.py ../README.md -odlc

# Run full link validation (slower, checks all URLs)
python validate/links.py ../README.md
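
If you prefer a single entry point, the commands above can be wrapped in a small helper. This is a hypothetical convenience script, not something shipped with the repository. It assumes it is started from the repository root, that the validators live in scripts/, and that a failing validator exits with a non-zero code.

```python
import subprocess
import sys
from typing import List


def build_commands(readme: str = "../README.md", full_link_check: bool = False) -> List[List[str]]:
    """Build the validator commands, relative to the scripts directory."""
    format_command = [sys.executable, "validate/format.py", readme]
    link_command = [sys.executable, "validate/links.py", readme]
    if not full_link_check:
        link_command.append("-odlc")  # duplicates only, faster
    return [format_command, link_command]


def run_all(commands: List[List[str]], scripts_dir: str = "scripts") -> int:
    """Run each command inside scripts_dir and return how many failed."""
    failures = 0
    for command in commands:
        completed = subprocess.run(command, cwd=scripts_dir)
        if completed.returncode != 0:
            failures += 1
    return failures
```

Calling `run_all(build_commands())` runs the format check and the duplicate-only link check; pass `full_link_check=True` to `build_commands` to test every URL as well.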

Automated CI checks

When you open a Pull Request:
  1. GitHub Actions automatically runs all validation scripts
  2. Build status is shown in the PR
  3. Build logs show any errors found
  4. PR cannot merge until all checks pass

Viewing build results

  1. Look for the check status at the bottom of your PR
  2. Click “Details” to view full build logs
  3. Scroll to find any error messages
  4. Fix the errors and push new commits
  5. Checks will run again automatically

Common validation errors

Error: Alphabetical order

Problem: APIs not sorted alphabetically
Fix: Reorder entries within the category

Error: Duplicate links

Problem: Same URL listed multiple times
Fix: Remove the duplicate entry

Error: Title ends with “API”

Problem: Title like “GitHub API” instead of “GitHub”
Fix: Remove “ API” from the title

Error: Invalid Auth value

Problem: Using an auth type that is not in the allowed list
Fix: Use one of: OAuth, apiKey, X-Mashape-Key, User-Agent, or No

Error: Description too long

Problem: Description exceeds 100 characters
Fix: Shorten the description

Error: Missing backticks

Problem: Auth value not wrapped in backticks
Fix: Change apiKey to `apiKey` (except for No)

Advanced validation

Test suite

The repository includes unit tests for validators:
  • scripts/tests/test_validate_links.py
  • scripts/tests/test_validate_format.py
These ensure the validation scripts themselves work correctly.

Contributing to validators

If you want to improve the validation tools:
  1. Fork the repository
  2. Modify scripts in scripts/validate/
  3. Add/update tests in scripts/tests/
  4. Submit a Pull Request
