🛡️ Data Integrity Hub

Employee directory with intentional UI data corruption for data integrity testing practice.

Overview

Data Integrity Hub exposes a clean dataset of 50 employee records via API and CSV download. The web UI, however, intentionally renders corrupted versions of 3 specific records. The core practice is comparing the three data sources (API JSON, CSV file, and the rendered HTML table) and identifying the discrepancies.

This app is designed for data migration testing, ETL validation, and learning to write tests that catch subtle data corruption (wrong values, typos, missing rows).

Base Path: /api/datahub
Auth Header: x-api-key: <your-key>
App URL: /datahub.html
Total Records: 50 employees (API) / 49 (UI)
The API and CSV are the source of truth. The UI has deliberate corruptions. Your job as a tester is to find and document all differences.

Quick Start

Step 1. Fetch API: GET /employees → 50 records JSON
Step 2. Download CSV: GET /employees/csv → clean file
Step 3. View UI: Open /datahub.html in browser
Step 4. Compare: Find the 3 discrepancies

curl https://qacloud.dev/api/datahub/employees \
  -H "x-api-key: YOUR_KEY" | jq '.employees | length'
# → 50

curl https://qacloud.dev/api/datahub/employees/csv \
  -H "x-api-key: YOUR_KEY" -o employees.csv
wc -l employees.csv
# → 51 (50 rows + header)

Dataset Overview

The dataset contains 50 employees with IDs d001–d050. Each record has the following fields: id, name, department, role, email, active, salary, and hire_date.

Sample Record (Clean, from API)

{
  "id": "d001",
  "name": "Alice Johnson",
  "department": "Engineering",
  "role": "Senior Engineer",
  "email": "alice.johnson@example.com",
  "active": true,
  "salary": 120000,
  "hire_date": "2019-03-15"
}
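
A quick shape check can catch malformed records before any cross-source comparison. The following is a minimal sketch, assuming the field names and types seen in the sample record above; `EXPECTED_FIELDS` and `validate_record` are hypothetical names for illustration, not part of the API.

```python
from datetime import date

# Field names and types inferred from the sample record (an assumption,
# not an official schema).
EXPECTED_FIELDS = {
    "id": str,
    "name": str,
    "department": str,
    "role": str,
    "email": str,
    "active": bool,
    "salary": int,
    "hire_date": str,
}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record looks well formed."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif type(record[field]) is not expected_type:
            # Exact type match, so a bool is not accepted where an int is expected.
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    for field in sorted(record.keys() - EXPECTED_FIELDS.keys()):
        problems.append(f"unexpected field: {field}")
    # hire_date is expected to be an ISO date string such as 2019-03-15.
    hire_date = record.get("hire_date")
    if isinstance(hire_date, str):
        try:
            date.fromisoformat(hire_date)
        except ValueError:
            problems.append(f"hire_date: not an ISO date: {hire_date!r}")
    return problems


sample = {
    "id": "d001",
    "name": "Alice Johnson",
    "department": "Engineering",
    "role": "Senior Engineer",
    "email": "alice.johnson@example.com",
    "active": True,
    "salary": 120000,
    "hire_date": "2019-03-15",
}
print(validate_record(sample))  # → []
```

The exact type comparison is deliberate: a CSV parser that leaves `active` as the string "true" would otherwise slip through unnoticed.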

Intentional Bugs

Three records are deliberately corrupted in the UI rendering. The API and CSV always return clean data for these records.

Bug 1: Status Flip
Emma Wilson (d005): active field inverted
Employee: Emma Wilson · ID: d005
API/CSV: active: true (currently employed)
UI shows: active: false (incorrectly shows as inactive)
Impact: Status badge in UI shows "Inactive", which misleads a reviewer into thinking Emma has left the company.

Bug 2: Name Typo
Elizabeth Murphy (d012): name misspelled
Employee: Elizabeth Murphy · ID: d012
API/CSV: "Elizabeth Murphy"
UI shows: "Elizabet Murphy" (missing 'h')
Impact: Name column has a typo. Simulates a data entry error during an ETL or import step.

Bug 3: Missing Row
Benjamin Green (d029): row deleted from UI
Employee: Benjamin Green · ID: d029
API/CSV: Record present (50 total)
UI shows: Row omitted entirely (49 displayed)
Impact: Row count mismatch. UI table shows 49 rows instead of 50. This is the most visible bug; a simple count check finds it.

API vs UI Diff Summary

ID   | Name             | Field      | API / CSV Value    | UI Value
d005 | Emma Wilson      | active     | true               | false
d012 | Elizabeth Murphy | name       | "Elizabeth Murphy" | "Elizabet Murphy"
d029 | Benjamin Green   | row exists | present (row 29)   | absent (deleted)

All other 47 records are identical across API, CSV, and UI.
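
The summary above can be reproduced mechanically by keying both sources on `id` and comparing field by field. The sketch below is one possible approach, not the app's own tooling; `diff_records` is a hypothetical helper, and the sample lists are trimmed to the fields this page documents rather than the full 50-record dataset.

```python
def diff_records(source: list[dict], rendered: list[dict]) -> list[tuple]:
    """Compare two record lists keyed by "id".

    Returns (id, field, source_value, rendered_value) tuples. A record that
    is missing on one side is reported once, with the field "<row>".
    """
    source_by_id = {r["id"]: r for r in source}
    rendered_by_id = {r["id"]: r for r in rendered}
    diffs = []
    for emp_id in sorted(source_by_id.keys() | rendered_by_id.keys()):
        src = source_by_id.get(emp_id)
        ui = rendered_by_id.get(emp_id)
        if src is None or ui is None:
            diffs.append((
                emp_id,
                "<row>",
                "present" if src is not None else "absent",
                "present" if ui is not None else "absent",
            ))
            continue
        for field in sorted(src.keys() | ui.keys()):
            if src.get(field) != ui.get(field):
                diffs.append((emp_id, field, src.get(field), ui.get(field)))
    return diffs


# Trimmed stand-ins for the API response and the scraped UI table.
api_records = [
    {"id": "d001", "name": "Alice Johnson", "active": True},
    {"id": "d005", "name": "Emma Wilson", "active": True},
    {"id": "d012", "name": "Elizabeth Murphy"},
    {"id": "d029", "name": "Benjamin Green"},
]
ui_records = [
    {"id": "d001", "name": "Alice Johnson", "active": True},
    {"id": "d005", "name": "Emma Wilson", "active": False},
    {"id": "d012", "name": "Elizabet Murphy"},
]

for entry in diff_records(api_records, ui_records):
    print(entry)
```

Keying on `id` rather than row position matters here: once d029 is missing, a positional comparison would report every later row as different.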

API Reference

GET /api/datahub/employees: Fetch all 50 employees as JSON

Returns the complete, uncorrupted employee list. Always 50 records.

{
  "employees": [
    {
      "id": "d001",
      "name": "Alice Johnson",
      "department": "Engineering",
      "role": "Senior Engineer",
      "email": "alice.johnson@example.com",
      "active": true,
      "salary": 120000,
      "hire_date": "2019-03-15"
    },
    ...
  ]
}

Filter by department or active status using query params, for example ?department=Engineering or ?active=true.
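
As a rough illustration of the filters, the sketch below builds (but does not send) an authenticated request using only the standard library. The host is taken from the curl examples on this page; `build_employees_request` is a hypothetical helper, and whether the API accepts both filters in one request is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host taken from the curl examples; adjust if your instance differs.
BASE_URL = "https://qacloud.dev/api/datahub"


def build_employees_request(api_key: str, **filters) -> Request:
    """Build a GET request for /employees with optional query filters."""
    # Booleans are lowercased so active=True becomes active=true.
    params = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in filters.items()
    }
    url = f"{BASE_URL}/employees"
    if params:
        url += "?" + urlencode(params)
    return Request(url, headers={"x-api-key": api_key})


req = build_employees_request("YOUR_KEY", department="Engineering", active=True)
print(req.full_url)
```

To actually send it, pass the request to `urllib.request.urlopen` and decode the body with `json.load`.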

GET /api/datahub/employees/csv: Download all 50 employees as CSV

Returns a CSV file with the header row + 50 data rows. Total line count = 51.

Response headers: Content-Type: text/csv, Content-Disposition: attachment; filename="employees.csv"

id,name,department,role,email,active,salary,hire_date
d001,Alice Johnson,Engineering,Senior Engineer,alice.johnson@example.com,true,120000,2019-03-15
d002,Bob Chen,Marketing,Marketing Analyst,bob.chen@example.com,true,85000,2020-07-22
...

This CSV is always clean, with no corruptions. Use it as the reference data to diff against the UI.
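
CSV values arrive as strings, so `active` and `salary` need converting before they can be compared with the API JSON. A minimal sketch, assuming the column names in the header row shown above; `parse_employees_csv` is a hypothetical helper name.

```python
import csv
import io


def parse_employees_csv(text: str) -> list[dict]:
    """Parse the CSV export into dicts typed like the API JSON."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        # "true"/"false" strings become booleans, salary becomes an int.
        row["active"] = row["active"].strip().lower() == "true"
        row["salary"] = int(row["salary"])
        records.append(row)
    return records


# The first rows of the export, as shown in the sample above.
CSV_SAMPLE = (
    "id,name,department,role,email,active,salary,hire_date\n"
    "d001,Alice Johnson,Engineering,Senior Engineer,alice.johnson@example.com,true,120000,2019-03-15\n"
    "d002,Bob Chen,Marketing,Marketing Analyst,bob.chen@example.com,true,85000,2020-07-22\n"
)

records = parse_employees_csv(CSV_SAMPLE)
print(len(records), records[0]["active"], records[1]["salary"])  # → 2 True 85000
```

For the downloaded file, read it with `open("employees.csv", newline="")` and pass the contents to the same function.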

Test Cases

TC-DH-001: API returns exactly 50 employee records [PASS]

Steps: GET /employees → check employees.length === 50. Response should include all IDs d001–d050.

TC-DH-002: CSV download contains 51 lines (header + 50 records) [PASS]

Steps: Download CSV → count lines → expect 51. Verify first line is the header row.

TC-DH-003: UI table shows only 49 rows (missing Benjamin Green) [FAIL]

Steps: Open the UI. Count visible rows in the employee table → expect 50, actual is 49. Verify d029 (Benjamin Green) is missing from the rendered table.

TC-DH-004: API shows Emma Wilson (d005) as active:true [PASS]

Steps: GET /employees → find d005 → verify active === true. This is the clean/correct value.

TC-DH-005: UI shows Emma Wilson (d005) as inactive (status corruption) [FAIL]

Steps: Open UI. Find Emma Wilson (d005). Check the Active/Inactive badge → it shows "Inactive" (active: false). The API says true; this is Bug 1.

TC-DH-006: API returns correct name "Elizabeth Murphy" for d012 [PASS]

Steps: GET /employees → find d012 → verify name is exactly "Elizabeth Murphy" (with 'h').

TC-DH-007: UI shows "Elizabet Murphy" (name typo corruption) [FAIL]

Steps: Open UI. Find employee d012. Read the displayed name → should be "Elizabeth Murphy" but displays as "Elizabet Murphy" (Bug 2).

TC-DH-008: CSV contains all 3 corrupted records with clean values [PASS]

Steps: Download CSV → check that d005 shows active=true, d012 shows "Elizabeth Murphy" (correct spelling), and d029's row exists.
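
The ID completeness check from TC-DH-001 and TC-DH-003 can be sketched as a comparison against the expected range d001–d050. The helper names below are hypothetical, and the input list is a synthetic stand-in for records read from any of the three sources.

```python
def expected_ids(first: int = 1, last: int = 50) -> list[str]:
    """Generate IDs in the d001..d050 format used by the dataset."""
    return [f"d{n:03d}" for n in range(first, last + 1)]


def missing_ids(records: list[dict]) -> list[str]:
    """Return the expected IDs that do not appear in the given records."""
    seen = {record["id"] for record in records}
    return [emp_id for emp_id in expected_ids() if emp_id not in seen]


# Stand-in for rows scraped from the UI: every ID except d029.
ui_like = [{"id": emp_id} for emp_id in expected_ids() if emp_id != "d029"]
print(len(ui_like), missing_ids(ui_like))  # → 49 ['d029']
```

Reporting which ID is missing, rather than only that the count is 49, makes the resulting bug report far easier to act on.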

QA Tasks

1. Row Count Comparison
Compare row counts across all three sources: API JSON (employees.length), CSV (wc -l minus 1 for header), and DOM rows (document.querySelectorAll('tbody tr').length). Document each count and calculate the discrepancy.
2. Full Field-by-Field Diff Script
Write a test that fetches GET /employees and compares each employee's fields against the values rendered in the UI table. The script should output exactly 3 differences: d005 active, d012 name, d029 missing. All other records should match.
3. Status Flag Validation
Fetch all employees from the API where active=false. Confirm d005 is NOT in that list (she's active=true). Then check the UI: verify d005's status badge says "Inactive" and document the discrepancy.
4. Name Exact Match Test
Search for "Elizabeth" in the UI search bar. Note whether d012 appears. Then search for "Elizabet" (missing 'h'); it should find her. Document that the UI search returns the corrupted name.
5. Missing Record Detection
Call GET /employees and confirm d029 (Benjamin Green) exists in the response. Open the UI and confirm he does not appear in the table. Try searching his name and verify the UI returns no results.
6. CSV vs API Consistency
Download the CSV. Parse it and compare every record to GET /employees. Expect zero differences (both API and CSV are clean). This validates that only the UI layer introduces the corruptions.
7. Department Filter Integrity
Use the department filter (if available in UI) to narrow to the departments where d005, d012, and d029 work. For each department, compare the filtered row count in the UI vs API ?department=X response count.
8. Automation Bug Report
Write a summary bug report with 3 entries (one per corruption), including: employee ID, field affected, expected value (from API), actual value (from UI), steps to reproduce, and suggested severity.
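
For the UI side of these tasks, the rendered table has to be turned into rows before it can be counted or compared. The sketch below uses the standard-library HTML parser and assumes the table uses `tbody`, `tr`, and `td` elements, as the selector in Task 1 suggests. The sample markup, its column order, and the `TableRowParser` and `table_rows` names are hypothetical. If the page builds its table with JavaScript, the HTML would need to come from a rendered DOM (for example, saved from a browser), not from a plain HTTP fetch.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr>, inside <tbody>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_tbody = False
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self._in_tbody = True
        elif tag == "tr" and self._in_tbody:
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text is only collected while inside a body cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag == "tbody":
            self._in_tbody = False


def table_rows(html: str) -> list[list[str]]:
    """Return the body rows of the table as lists of cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical markup; the real page structure may differ.
SAMPLE_HTML = """
<table>
  <thead><tr><th>ID</th><th>Name</th><th>Status</th></tr></thead>
  <tbody>
    <tr><td>d001</td><td>Alice Johnson</td><td><span class="badge">Active</span></td></tr>
    <tr><td>d005</td><td>Emma Wilson</td><td><span class="badge">Inactive</span></td></tr>
  </tbody>
</table>
"""

rows = table_rows(SAMPLE_HTML)
print(len(rows), rows[1])
```

The header row is skipped because only rows inside `tbody` are collected, which mirrors the `tbody tr` selector used for the DOM count.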