🛡️ Data Integrity Hub

Employee directory with intentional UI data corruption for data integrity testing practice.

Overview

Data Integrity Hub exposes a clean dataset of 50 employee records via API and CSV download. The web UI, however, intentionally renders corrupted versions of 3 specific records. The core practice is comparing these three data sources — API JSON, CSV file, and the rendered HTML table — and identifying the discrepancies.

This app is designed for data migration testing, ETL validation, and learning to write tests that catch subtle data corruption (wrong values, typos, missing rows).

Base Path

/api/datahub

Auth Header

x-api-key: <your-key>

App URL

/datahub.html

Total Records

50 employees (API and CSV) / 49 (UI)

The API and CSV are the source of truth. The UI has deliberate corruptions. Your job as a tester is to find and document all differences.

Quick Start

Step 1: Fetch API
GET /employees → 50 records JSON

Step 2: Download CSV
GET /employees/csv → clean file

Step 3: View UI
Open /datahub.html in browser

Step 4: Compare
Find the 3 discrepancies

curl https://qacloud.dev/api/datahub/employees \
  -H "x-api-key: YOUR_KEY" | jq '.employees | length'
# → 50

curl https://qacloud.dev/api/datahub/employees/csv \
  -H "x-api-key: YOUR_KEY" -o employees.csv
wc -l employees.csv
# → 51 (50 rows + header)
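The same API call can be scripted for use in automated checks. The following is a minimal sketch using only the Python standard library. The host is taken from the curl examples above; the helper names `build_request` and `fetch_employees` are illustrative and not part of the app.

```python
import json
import urllib.request

# Host and base path as they appear in the curl examples above.
BASE_URL = "https://qacloud.dev/api/datahub"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the x-api-key auth header."""
    return urllib.request.Request(BASE_URL + path, headers={"x-api-key": api_key})


def fetch_employees(api_key: str) -> list:
    """Fetch the employee list; assumes the {"employees": [...]} envelope."""
    with urllib.request.urlopen(build_request("/employees", api_key)) as resp:
        return json.load(resp)["employees"]
```

Calling `len(fetch_employees("YOUR_KEY"))` should mirror the `jq` length check above.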

Dataset Overview

The dataset contains 50 employees with employee IDs d001–d050. Each record has the following fields:

id — string, format d001–d050
name — full name string
department — department name string
role — job title string
email — email address string
active — boolean, employment status
salary — integer, annual USD
hire_date — ISO date string (YYYY-MM-DD)

Sample Record (Clean — from API)

{
  "id": "d001",
  "name": "Alice Johnson",
  "department": "Engineering",
  "role": "Senior Engineer",
  "email": "alice.johnson@example.com",
  "active": true,
  "salary": 120000,
  "hire_date": "2019-03-15"
}
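The field list above can be turned into a per-record schema check. This is a sketch only: the function name `validate_record` and the problem messages are illustrative, and the strict type comparison (a string "120000" is rejected for salary) is a design assumption rather than documented API behavior.

```python
import re
from datetime import date

# Expected Python type per field, taken from the field list above.
FIELD_TYPES = {
    "id": str, "name": str, "department": str, "role": str,
    "email": str, "active": bool, "salary": int, "hire_date": str,
}


def validate_record(rec: dict) -> list:
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for field, expected in FIELD_TYPES.items():
        if field not in rec:
            problems.append(f"missing field: {field}")
        elif type(rec[field]) is not expected:  # strict: bool does not pass as int
            problems.append(f"{field}: expected {expected.__name__}")
    # IDs must fall in the documented range d001-d050.
    rec_id = rec.get("id")
    if isinstance(rec_id, str) and not re.fullmatch(r"d0(0[1-9]|[1-4]\d|50)", rec_id):
        problems.append("id: outside d001-d050")
    # hire_date must parse as an ISO calendar date.
    hired = rec.get("hire_date")
    if isinstance(hired, str):
        try:
            date.fromisoformat(hired)
        except ValueError:
            problems.append("hire_date: not a valid ISO date")
    return problems
```

Running this over every record from the API gives a quick sanity check before any cross-source comparison.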

Intentional Bugs

Three records are deliberately corrupted in the UI rendering. The API and CSV always return clean data for these records.

Bug 1 — Status Flip

Emma Wilson (d005) — active field inverted

Employee: Emma Wilson · ID: d005

API/CSV: active: true (currently employed)

UI shows: active: false (incorrectly shows as inactive)

Impact: The status badge in the UI shows "Inactive", which misleads a reviewer into thinking Emma has left the company.

Bug 2 — Name Typo

Elizabeth Murphy (d012) — name misspelled

Employee: Elizabeth Murphy · ID: d012

API/CSV: "Elizabeth Murphy"

UI shows: "Elizabet Murphy" (missing 'h')

Impact: The Name column has a typo. This simulates a data entry error during an ETL or import step.

Bug 3 — Missing Row

Benjamin Green (d029) — row deleted from UI

Employee: Benjamin Green · ID: d029

API/CSV: Record present (50 total)

UI shows: Row omitted entirely (49 displayed)

Impact: Row count mismatch. The UI table shows 49 rows instead of 50. This is the most visible bug: a simple count check finds it.

API vs UI Diff Summary

ID	Name	Field	API / CSV Value	UI Value
d005	Emma Wilson	active	true	false
d012	Elizabeth Murphy	name	"Elizabeth Murphy"	"Elizabet Murphy"
d029	Benjamin Green	row exists	present (row 29)	absent (deleted)

The other 47 records are identical across API, CSV, and UI.
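A diff like the summary table can be computed by keying both sources on id. The sketch below assumes the UI rows have already been scraped into dicts with the same field names as the API records; how that scraping is done is outside its scope. The function name `diff_records` and the tuple layout are illustrative.

```python
def diff_records(api: list, ui: list) -> list:
    """Compare UI records against API records, keyed by id.

    Returns (id, field, api_value, ui_value) tuples. A row that is
    missing from the UI is reported once with field "row exists".
    """
    ui_by_id = {rec["id"]: rec for rec in ui}
    diffs = []
    for rec in api:
        shown = ui_by_id.get(rec["id"])
        if shown is None:
            diffs.append((rec["id"], "row exists", "present", "absent"))
            continue
        for field, value in rec.items():
            if shown.get(field) != value:
                diffs.append((rec["id"], field, value, shown.get(field)))
    return diffs
```

Treating the API as the reference side means a record that exists only in the UI would go unreported; a symmetric check would need a second pass over the UI ids.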

API Reference

GET /api/datahub/employees Fetch all 50 employees as JSON

Returns the complete, uncorrupted employee list. Always 50 records.

{
  "employees": [
    {
      "id": "d001",
      "name": "Alice Johnson",
      "department": "Engineering",
      "role": "Senior Engineer",
      "email": "alice.johnson@example.com",
      "active": true,
      "salary": 120000,
      "hire_date": "2019-03-15"
    },
    ...
  ]
}

Filter by department or active status using query parameters, for example ?department=Engineering or ?active=true.
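A small helper can build these filtered URLs safely, including URL-encoding of department names. This is a sketch: `employees_url` is an illustrative name, and whether the API accepts both parameters in a single request is an assumption the source does not confirm.

```python
from typing import Optional
from urllib.parse import urlencode

EMPLOYEES_PATH = "/api/datahub/employees"


def employees_url(department: Optional[str] = None,
                  active: Optional[bool] = None) -> str:
    """Build the employees URL with optional filter query parameters."""
    params = {}
    if department is not None:
        params["department"] = department
    if active is not None:
        # The docs show lowercase ?active=true, so serialize booleans that way.
        params["active"] = "true" if active else "false"
    query = urlencode(params)  # percent-encodes values, joins pairs with "&"
    return EMPLOYEES_PATH + ("?" + query if query else "")
```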

GET /api/datahub/employees/csv Download all 50 employees as CSV

Returns a CSV file with the header row + 50 data rows. Total line count = 51.

Response headers: Content-Type: text/csv, Content-Disposition: attachment; filename="employees.csv"

id,name,department,role,email,active,salary,hire_date
d001,Alice Johnson,Engineering,Senior Engineer,alice.johnson@example.com,true,120000,2019-03-15
d002,Bob Chen,Marketing,Marketing Analyst,bob.chen@example.com,true,85000,2020-07-22
...

This CSV is always clean — no corruptions. Use it as the reference data to diff against the UI.
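CSV values arrive as strings, so they need type conversion before they can be compared with the JSON records. A minimal parsing sketch, assuming the header shown above and lowercase true/false in the active column (`parse_employees_csv` is an illustrative name):

```python
import csv
import io


def parse_employees_csv(text: str) -> list:
    """Parse the employees CSV into dicts typed like the API records."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Convert the two non-string columns so rows compare equal to JSON.
        row["active"] = row["active"].strip().lower() == "true"
        row["salary"] = int(row["salary"])
        rows.append(dict(row))
    return rows
```

With both sources typed the same way, a record-by-record equality check between CSV and API becomes a plain `==` comparison.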

Test Cases

TC-DH-001 API returns exactly 50 employee records PASS

Steps: GET /employees → check employees.length === 50. Response should include all IDs d001–d050.

TC-DH-002 CSV download contains 51 lines (header + 50 records) PASS

Steps: Download CSV → count lines → expect 51. Verify first line is the header row.

TC-DH-003 UI table shows only 49 rows (missing Benjamin Green) FAIL

Steps: Open the UI. Count visible rows in the employee table → expect 50, actual is 49. Verify d029 (Benjamin Green) is missing from the rendered table.

TC-DH-004 API shows Emma Wilson (d005) as active:true PASS

Steps: GET /employees → find d005 → verify active === true. This is the clean/correct value.

TC-DH-005 UI shows Emma Wilson (d005) as inactive — status corruption FAIL

Steps: Open UI. Find Emma Wilson (d005). Check the Active/Inactive badge → it shows "Inactive" (active: false). The API says true — this is Bug 1.

TC-DH-006 API returns correct name "Elizabeth Murphy" for d012 PASS

Steps: GET /employees → find d012 → verify name is exactly "Elizabeth Murphy" (with 'h').

TC-DH-007 UI shows "Elizabet Murphy" — name typo corruption FAIL

Steps: Open UI. Find employee d012. Read the displayed name → should be "Elizabeth Murphy" but displays as "Elizabet Murphy" (Bug 2).

TC-DH-008 CSV contains all 3 corrupted records with clean values PASS

Steps: Download CSV → check that d005 shows active=true, d012 shows "Elizabeth Murphy" (correct spelling), and d029's row exists.

QA Tasks

1. Row Count Comparison

Compare row counts across all three sources: API JSON (employees.length), CSV (wc -l minus 1 for header), and DOM rows (document.querySelectorAll('tbody tr').length). Document each count and calculate the discrepancy.
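The three counts can also be gathered in one place. The sketch below assumes `html` is a snapshot of the rendered DOM (for example, saved from the browser after the table has loaded), since a client-rendered table may not appear in the raw page source. It also assumes no CSV field contains an embedded newline. `RowCounter` and `count_sources` are illustrative names.

```python
from html.parser import HTMLParser


class RowCounter(HTMLParser):
    """Count <tr> elements that sit inside a <tbody>."""

    def __init__(self):
        super().__init__()
        self.in_tbody = False
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self.in_tbody = True
        elif tag == "tr" and self.in_tbody:
            self.rows += 1

    def handle_endtag(self, tag):
        if tag == "tbody":
            self.in_tbody = False


def count_sources(api_records: list, csv_text: str, html: str) -> dict:
    """Return the record count seen in each of the three sources."""
    counter = RowCounter()
    counter.feed(html)
    csv_rows = len(csv_text.strip().splitlines()) - 1  # minus the header row
    return {"api": len(api_records), "csv": csv_rows, "ui": counter.rows}
```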

2. Full Field-by-Field Diff Script

Write a test that fetches GET /employees and compares each employee's fields against the rows rendered in the UI table. The script should output exactly 3 differences: d005 active, d012 name, d029 missing. All other records should match.

3. Status Flag Validation

Fetch all employees from the API where active=false. Confirm d005 is NOT in that list (she's active=true). Then check the UI — verify d005's status badge says "Inactive" and document the discrepancy.

4. Name Exact Match Test

Search for "Elizabeth" in the UI search bar. Note whether d012 appears. Then search for "Elizabet" (missing 'h') — it should find her. Document that the UI search returns the corrupted name.

5. Missing Record Detection

Call GET /employees and confirm d029 (Benjamin Green) exists in the response. Open the UI and confirm he does not appear in the table. Try searching his name — verify the UI returns no results.

6. CSV vs API Consistency

Download the CSV. Parse it and compare every record to GET /employees. Expect zero differences (both API and CSV are clean). This validates that only the UI layer introduces the corruptions.

7. Department Filter Integrity

Use the department filter (if available in UI) to narrow to the departments where d005, d012, and d029 work. For each department, compare the filtered row count in the UI vs API ?department=X response count.

8. Automation Bug Report

Write a summary bug report with 3 entries (one per corruption), including: employee ID, field affected, expected value (from API), actual value (from UI), steps to reproduce, and suggested severity.
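Once the discrepancies are collected, the report entries can be generated rather than typed by hand. This sketch assumes each discrepancy is a hypothetical (id, field, expected, actual) tuple and leaves severity as a placeholder for the tester to assign; the entry layout and the name `format_bug_report` are illustrative.

```python
def format_bug_report(diffs: list) -> str:
    """Render one plain-text bug entry per (id, field, expected, actual) tuple."""
    lines = []
    for n, (emp_id, field, expected, actual) in enumerate(diffs, start=1):
        lines.append(f"BUG-{n}: {emp_id} / {field}")
        lines.append(f"  Expected (API): {expected!r}")
        lines.append(f"  Actual (UI):    {actual!r}")
        lines.append(
            f"  Steps: GET /api/datahub/employees, open /datahub.html, "
            f"compare {field} for {emp_id}"
        )
        lines.append("  Severity: (to be assigned)")
    return "\n".join(lines)
```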