Questionnaire Management
Recursive JSONB schema validation, versioning lifecycle, leaf-weight rules, and bulk ingestion.
The Questionnaire Management system is designed to handle complex, hierarchical assessment frameworks with strict mathematical integrity for scoring and longitudinal tracking.
1. Core Data Model
The system separates the identity of a questionnaire from its specific content versions.
2. Schema Architecture (JSONB)
Instead of a complex relational tree for questions and sections (which makes versioning and querying slow), we use a validated JSONB tree. This allows for recursive nesting while maintaining high performance.
Structural Rules (Recursive Hierarchy)
The "Leaf-Weight" Rule
To ensure scoring mathematical integrity, the following rules are enforced by the QuestionnaireSchemaValidator:
- Mutual Exclusivity: A section can either contain sub-sections OR questions, never both.
- Weight Placement: Weights (
number) can ONLY be assigned to "Leaf" sections (sections containing questions). - The 100% Rule: The sum of all leaf section weights within a single version must equal exactly 100.
Why? This guarantees that every question belongs to a weighted bucket, making the calculation of a normalized score (0-100) mathematically trivial and deterministic.
3. Versioning & Immutability
Questionnaires follow a strict lifecycle to ensure that historical submission data remains valid even if the questionnaire changes.
- States and Transitions: Questionnaires progress through
DRAFT,ACTIVE, andDEPRECATEDstates.DRAFT: Editable, but cannot accept submissions.ACTIVE: Accepts submissions, read-only. Only oneACTIVEversion per questionnaire at any time.DEPRECATED: Cannot accept new submissions, read-only, but historical submissions linked to it remain accessible.- Transition:
DRAFTcan bePUBLISHEDtoACTIVE. AnACTIVEversion can be manuallyDEPRECATED. Publishing a new version automaticallyDEPRECATESthe previouslyACTIVEversion.
- Single Draft Rule: Only one
DRAFTversion can exist for a givenQuestionnaireat any time, preventing conflicting edits. - Strict Incremental Versioning:
QuestionnaireVersionnumbers are strictly sequential (v1, v2, v3...), enforced by the system to prevent skipping and maintain a clear audit trail. - Submission Linking: All submissions are permanently linked to the specific
QuestionnaireVersionthey were made against, ensuring data immutability and historical accuracy. - Editing and Submissions: Only
DRAFTversions are editable. OnlyACTIVEversions accept submissions. - Historical Accessibility: Submissions linked to
DEPRECATEDversions remain fully accessible for historical analysis and comparison, queryable via registered dimensions.
4. Design Justifications
Questionnaire Versioning Decisions
- Questionnaire Status Alignment: The existing
QuestionnaireStatusenum (DRAFT,PUBLISHED,ARCHIVED) has been aligned with the new lifecycle states:DRAFT,ACTIVE,DEPRECATED.PUBLISHEDmaps toACTIVE. - Deprecation Safeguards (UI/Global Control): The UI provides warnings to administrators about the consequences of deprecating an Active version (e.g., number of existing submissions). A global activation/deactivation mechanism for active forms complements individual version states.
- Historical Data Querying (Dimension-backed): Historical submissions are queryable using a dimension-backed approach, relying on a registry of standardized dimensions. This ensures data consistency and comparability across different questionnaire versions.
- User Experience for Deprecated Versions: Users attempting to access a deprecated questionnaire version receive a clear message and are redirected to the latest
ACTIVEversion (if one exists).
Why JSONB for the Schema?
- Flexibility: Institutional questionnaires often change structure (adding sub-sections). JSONB handles this without schema migrations.
- Atomic Loading: Fetching a complete questionnaire for the UI requires one database read instead of recursive joins.
- Integrity: We use NestJS/Zod and a custom
QuestionnaireSchemaValidatorto ensure the JSON matches our strict rules before it ever hits the database.
Why Decouple Dimensions?
Dimensions (e.g., "Clarity", "Organization") are stored in a global registry. Question nodes in the JSON schema reference these by a stable dimensionCode.
- Cross-Questionnaire Analytics: This allows the system to compare "Clarity" scores across different types of questionnaires (Student Feedback vs. Peer Review).
Institutional Snapshotting
When a questionnaire is submitted, we don't just store IDs. We snapshot the current Campus, Department, and Course names.
- Justification: If a Department is renamed next year, historical feedback for "Dept A" should not retroactively move to "Dept B" in reports. It preserves the institutional state at the moment of feedback.
5. Bulk Ingestion & Orchestration
The system provides a robust orchestration layer for ingesting bulk questionnaire data from external sources (e.g., historical CSVs, external APIs).
The Ingestion Engine
The IngestionEngine processes asynchronous streams of submission data using a high-performance orchestration model:
- Bounded Concurrency: Processes multiple records simultaneously using
p-limit(default 6) to maximize throughput without overwhelming the database connection pool. - Per-Record Isolation: Each record is processed in a forked
EntityManagerand its own transaction. A failure in one record does not affect others. - Speculative Dry-Runs: Executes the complete business logic, including database constraints and triggers, but uses a custom
DryRunRollbackErrorto ensure the transaction is always rolled back. - Deduplicated Mapping: Uses
IngestionMapperServicewith a request-scopedDataLoaderto cache institutional entity lookups (Users, Courses, Semesters) across concurrent workers. - Resource Safety: Implements hard memory limits (5,000 records) and automatic backpressure if the processing queue grows too large.
Concrete Adapters (CSV & Excel)
- Streaming-first: Both adapters return
AsyncIterable<IngestionRecord>and never buffer the entire file. - Header normalization: Keys are trimmed, lowercased, stripped of non-alphanumerics (keeping
_and-), and de-duplicated with suffixes (_1,_2). - CSV configuration: Supports
delimiter,quote,escape, andseparatoroptions. - Excel configuration: Supports
sheetNameor 1-basedsheetIndexselection. - Row identification:
sourceIdentifieris 1-based for data rows (header row excluded).