Not a summary. Not a guess. A forensic decomposition of every DSE17 question from 2012 to 2024, with the actual question text, the traps, and the solutions.
We parsed every single Paper 2 (Multiple Choice) from 2012 to 2024. Here's what DSE17 looks like in hard numbers:
Every year, HKEAA puts DSE17 questions in almost the exact same slots:
Across 13 years, every DSE17 question falls into one of 4 templates. Below, we show real HKEAA questions, walk through the solution step-by-step, and name the exact psychological trap.
Section B is the harder half of Paper 2 (Q31–Q45). Q45 is the final question — always reserved for this topic. Appeared in: 2013, 2014, 2020, 2021, 2022, 2023, 2024 (7 of 13 years).
The one rule: If every data point \(x_i\) is transformed to \(kx_i + c\), then:
• Mean shifts: \(\mu_{new} = k\mu + c\)
• Variance scales: \(\text{Var}_{new} = k^2 \cdot \text{Var}\) (the \(+c\) disappears completely)
• SD scales: \(\text{SD}_{new} = |k| \cdot \text{SD}\)
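All three rules can be checked numerically in a few lines. A minimal sketch (the dataset and the values of \(k\) and \(c\) below are toy examples, not from any paper):

```python
import math
import statistics

# Toy dataset and transform y = kx + c (illustrative values).
data = [2, 4, 6, 8, 10]
k, c = 3, 4
transformed = [k * x + c for x in data]  # [10, 16, 22, 28, 34]

mu, var = statistics.mean(data), statistics.pvariance(data)

# Mean shifts: mu_new = k*mu + c
assert statistics.mean(transformed) == k * mu + c
# Variance scales by k^2 — the +c disappears completely
assert statistics.pvariance(transformed) == k**2 * var
# SD scales by |k|
assert math.isclose(statistics.pstdev(transformed), abs(k) * statistics.pstdev(data))
```

Note the use of `pvariance`/`pstdev` (population formulas, dividing by \(n\)), which is the convention DSE uses — `statistics.variance` would divide by \(n-1\) instead.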
Q44 is the second-to-last question in the hard section. Appeared in: 2014, 2017, 2018, 2019, 2022, 2023, 2024 (7 of 13 years).
The one formula: \( z = \dfrac{x - \mu}{\sigma} \). HKEAA gives you enough information to set up simultaneous equations and solve for the unknowns.
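The mechanics of the simultaneous-equation setup, with made-up numbers (not from a real paper): two marks with known z-scores give two linear equations in \(\mu\) and \(\sigma\).

```python
# Hypothetical data: mark 71 has z = 1.5, mark 59 has z = -0.5.
# x = mu + z*sigma gives two linear equations in mu and sigma.
x1, z1 = 71, 1.5
x2, z2 = 59, -0.5

# Subtracting the second equation from the first eliminates mu.
sigma = (x1 - x2) / (z1 - z2)  # (71 - 59) / (1.5 - (-0.5)) = 6.0
mu = x1 - z1 * sigma           # 71 - 1.5 * 6 = 62.0
```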
Section A is the easier half (Q1–Q30). Q30 is the last question of Section A — the hardest "easy" question. Appeared in: 2012, 2013, 2016, 2017, 2022, 2023, 2024 (7 of 13 years).
The format: A dataset with unknowns (\(m\), \(n\), \(x\), \(y\)). HKEAA gives you the Mode, Median, or Mean. You must logically deduce what the unknowns are and check which statements "MUST be true."
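The "MUST be true" logic can be brute-forced to build intuition: enumerate every dataset consistent with the constraint, then check whether a statement holds in all of them. A sketch with a hypothetical setup (not a real paper question; unknowns restricted to nonnegative integers for simplicity):

```python
from statistics import median

# Hypothetical template-3 setup: the dataset {2, 3, 3, m, n} has mean 4,
# so m + n = 12. A statement "MUST be true" only if it holds for EVERY
# valid (m, n) pair.
candidates = [(m, 12 - m) for m in range(0, 13) if m <= 12 - m]
datasets = [sorted([2, 3, 3, m, n]) for m, n in candidates]

# "The median must be 3" — holds for every valid completion.
must_median_3 = all(median(d) == 3 for d in datasets)

# "The mode must be 3" — fails: e.g. m = n = 6 ties the mode.
must_mode_3 = all(
    d.count(3) > max(d.count(v) for v in d if v != 3)
    for d in datasets
)
```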
Q29 is the second-to-last in Section A. The most straightforward template. Appeared in: 2012, 2013, 2016, 2019, 2020, 2024 (6 of 13 years).
Template 4 is the most straightforward — it's a formula plug-in. The difficulty comes only from reading the diagram correctly and not confusing Q1 with Q3.
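The plug-in itself is two subtractions — the entire risk is reading the wrong value off the diagram. A sketch with a hypothetical five-number summary (invented numbers):

```python
# Hypothetical five-number summary read off a box-and-whisker diagram.
minimum, q1, med, q3, maximum = 10, 18, 25, 34, 47

iqr = q3 - q1                   # 16 — spread of the middle 50%
full_range = maximum - minimum  # 37
# Classic traps: swapping Q1 and Q3, or reporting range when IQR is asked.
```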
Here's the actual data — every Q45 (Data Transformation) question from our database, showing the exact operation HKEAA used each year. Watch the dials turn:
| Year | Question Stem (Abbreviated) | Transform | What They Asked | Core Skill |
|---|---|---|---|---|
| 2013 | Multiply each \(x_i\) by 3, add 4 | y = 3x + 4 | Find new variance | Var scales by \(k^2\) |
| 2014 | Multiply by −1, add 14 | y = −x + 14 | Find new variance | \((-1)^2 = 1\) → unchanged |
| 2020 | \(\{20a+3, 20a+5, \ldots, 20a+17\}\) | Strip 20a | Find the variance | Translation → strip, then compute |
| 2021 | Arithmetic sequence \(T(1)..T(49)\) vs \(T(51)..T(99)\) | Shift by 50d | "Must be true" on median, range, variance | Translation invariance on AP |
| 2022 | \(S_1 = \{d-6,\ldots\}\) vs \(S_2 = \{d-7,\ldots\}\) | Strip d | "Must be true" on mean, SD, IQR | Translation → compare two sets |
| 2023 | \(\{1-9n, 3-9n, \ldots, 7-9n\}\) | Strip 9n | "Must be true" on SD, median, range | Translation + edge-case reasoning |
| 2024 | Var(\(x\)) = 16. Find SD of \(9x-5\) | y = 9x − 5 | Find new SD | SD scales by \(\lvert k\rvert\); convert Var↔SD |
The underlying concept never changes: adding a constant doesn't affect spread; multiplying by \(k\) scales SD by \(|k|\) and variance by \(k^2\). HKEAA varies only the packaging: sometimes a direct calculation, sometimes "Which must be true?", sometimes comparing two datasets, sometimes wrapping it in arithmetic sequences. The cognitive skill is identical every single year.
We don't just "save questions." We decompose every question into structured data. Here are three real entries from our database, showing exactly what we capture:
| Field | Purpose | Example |
|---|---|---|
| skillsTested | The irreducible cognitive skill(s). This powers our diagnostic engine — if a student fails, we know exactly what to re-teach. | ["scaling-effect", "translation-invariance"] |
| prerequisites | Directed edges in the skill graph. If they fail "scaling-effect", fall back to "variance-formula" → "sd-formula". | ["variance-formula", "sd-formula"] |
| toAce | The exact, minimal solution path. This is what a perfect student thinks. | "Var=16 → SD=4. SD(9x−5) = 9×4 = 36." |
| errorTraps | HKEAA's psychological traps. We use these to auto-generate distractors (wrong MC options) that catch real misconceptions. | "Include −5 in scaling: 9(4)−5=31" |
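Put together, a single decomposed entry looks roughly like this — the field names come from the table above, but the surrounding structure (and any fields not shown) is an illustrative assumption, not our exact schema:

```python
# Sketch of one decomposed question entry (2024 Q45); structure is illustrative.
entry_2024_q45 = {
    "year": 2024,
    "question": "Q45",
    "skillsTested": ["scaling-effect", "translation-invariance"],
    "prerequisites": ["variance-formula", "sd-formula"],
    "toAce": "Var=16 → SD=4. SD(9x−5) = 9×4 = 36.",
    "errorTraps": ["Include −5 in scaling: 9(4)−5=31"],
}
```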
These are the 11 irreducible skills we extracted. Every DSE17 question from 2012–2024 can be solved using some combination of these.
| Skill | Level | What It Means | Tested |
|---|---|---|---|
| variance-formula | FOUND. | Calculate \(\text{Var} = \Sigma(x_i - \bar{x})^2 / n\) from raw data. Must find mean first. | 5× |
| sd-formula | FOUND. | \(\text{SD} = \sqrt{\text{Var}}\). Convert between variance and standard deviation. | 3× |
| translation-invariance | CORE | Adding a constant doesn't change spread: \(\text{Var}(x+c) = \text{Var}(x)\). Only location shifts. | 5× |
| scaling-effect | CORE | Multiplying by \(k\) scales SD by \(\lvert k\rvert\) and Var by \(k^2\). Combined: \(\text{SD}(kx+c) = \lvert k\rvert \cdot \text{SD}(x)\). | 1× |
| box-whisker-reading | FOUND. | Read min, Q1, median, Q3, max from a box-and-whisker diagram. Calculate IQR and range. | 1× |
| iqr-definition | FOUND. | \(\text{IQR} = Q3 - Q1\). Measures the spread of the middle 50% of data. | 2× |
| standard-score | CORE | \(z = (x - \bar{x}) / \text{SD}\). Measures how many SDs a value is from the mean. | 1× |
| central-tendency-combined | CORE | Reason about mean, median, and mode simultaneously. Given constraints, deduce valid datasets. | 3× |
| dataset-comparison | ADV. | Compare two related datasets (e.g. \(S_1\) vs \(S_2\) with shared parameter \(d\)) on mean, SD, IQR. | 1× |
| arithmetic-sequence-statistics | ADV. | Exploit AP structure (\(T(n) = a + (n-1)d\)) to deduce statistical properties without brute-force computation. | 1× |
| constrained-dataset-reasoning | ADV. | Given unknowns + constraints (mean = k, mode = m), enumerate ALL valid datasets and check which properties MUST hold. | 2× |
If a student masters just translation-invariance + variance-formula, they can already attempt 5 of the 9 recent questions. Add central-tendency-combined and they cover 8 of 9.
These questions weren't written by a tutor or pulled from a textbook. They were generated by our engine, which turns the same "dials" HKEAA uses to produce new, mathematically sound questions with deliberate trap options.
DSE17 is one topic. The same methodology applies to all 20 DSE Math topics. Here's the product roadmap this data infrastructure makes possible.
We've proven we can decompose every past paper question into atomic skills and generate mathematically sound new questions. Scale this to all 20 topics:
Because every question is tagged with skills and prerequisites, a student's wrong answers map directly to their skill gaps — not just "weak at statistics" but "specifically fails at constrained-dataset-reasoning because central-tendency-combined isn't solid yet."
constrained-dataset-reasoning gap → Falls back to central-tendency-combined → Drills mode/median constraint with 3 easier questions → Retests with a harder variant → Only advances when solid.
This is the difference between a tutor saying "practice more stats" and a system saying "you need 3 more reps on this exact prerequisite before we move on."
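That fallback chain can be sketched as a walk over the prerequisite graph. A minimal illustration — the edges are the ones named in this section, but the function and data layout are a sketch, not the production implementation:

```python
# Prerequisite edges named in this section (subset of the full skill graph).
PREREQS = {
    "constrained-dataset-reasoning": ["central-tendency-combined"],
    "scaling-effect": ["variance-formula", "sd-formula"],
    "central-tendency-combined": [],
    "variance-formula": [],
    "sd-formula": [],
}

def remediation_path(failed_skill, mastered):
    """Walk prerequisites depth-first: drill unmastered
    prerequisites first, then retest the failed skill."""
    path = []
    for pre in PREREQS.get(failed_skill, []):
        if pre not in mastered:
            path.extend(remediation_path(pre, mastered))
    path.append(failed_skill)
    return path
```

For a student who fails constrained-dataset-reasoning with no prerequisites mastered, `remediation_path` drills central-tendency-combined before retesting; if the prerequisite is already solid, it goes straight to the failed skill.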
With 13 years of mutation data, we can model HKEAA's question generation patterns. For any topic, we know:
Combining the question engine + diagnostic engine, we can build a fully adaptive practice product:
Every year HKEAA releases a new paper, we add ~3–4 new questions to the bank, refine our mutation models, and update our prediction engine. The system gets smarter every July. No tutor center can replicate this — they'd need to rebuild the entire decomposition infrastructure from scratch.
The data asset built here — tagged atomic skills across all 20 topics, with prerequisite graphs and error trap catalogs — is the defensible foundation that everything else sits on.