Data Science Quiz For Humanities

R-bloggers 2025-11-22

[This article was first published on coding-the-past, and kindly contributed to R-bloggers.]

Test your skills with this interactive data science quiz covering statistics, Python, R, and data analysis.

1. Which of the following best describes a z-score?
   A. A measure of central tendency
   B. The number of standard deviations a value is from the mean
   C. The square of the correlation coefficient
   D. A type of probability distribution

2. What is the main advantage of using tidy data principles in R?
   A. Increased computation speed
   B. Easier visualization and consistent analysis
   C. Reduced memory usage
   D. Automatically removes missing values

3. In Python, which library is most commonly used for data manipulation?
   A. matplotlib
   B. numpy
   C. pandas
   D. statsmodels

4. Which metric is best for evaluating a classification model on imbalanced data?
   A. Accuracy
   B. Recall
   C. Variance
   D. R-squared

5. In a linear regression, what does R² represent?
   A. Slope of the regression line
   B. Variance explained by the model
   C. Covariance between variables
   D. Degree of overfitting

6. In historical or humanities datasets, which challenge occurs most frequently?
   A. Excessively large sample sizes
   B. Perfectly standardized variable names
   C. Missing or incomplete records
   D. Highly structured relational databases

7. What does the groupby() function do in pandas?
   A. Sorts values by category
   B. Applies aggregate operations to subsets of data
   C. Removes duplicates
   D. Normalizes columns

8. What is the primary purpose of cross-validation?
   A. Increase training accuracy
   B. Test different loss functions
   C. Evaluate a model on unseen data to reduce overfitting
   D. Speed up model training

9. Feature engineering refers to:
   A. Training a model with more iterations
   B. Preparing input variables to improve model performance
   C. Removing outliers
   D. Selecting the best model

10. Which visualization is most appropriate for the distribution of a continuous variable?
   A. Bar chart
   B. Histogram
   C. Pie chart
   D. Line plot

11. A z-score of +2.5 means:
   A. The value is below the mean
   B. The value is 2.5 SD above the mean
   C. The value is an outlier
   D. The standard deviation is 2.5

12. Which is an advantage of using R for statistical analysis?
   A. Native GPU acceleration
   B. Strong statistical libraries and ggplot2
   C. Automatic machine learning
   D. Faster than Python

13. Normalization in data preprocessing means:
   A. Converting categorical data to numeric
   B. Rescaling values to a standard range like 0–1
   C. Detecting outliers
   D. Filling missing values

14. Why may historical datasets be biased?
   A. They always include all records
   B. Selective or incomplete record-keeping
   C. Automatic modern data collection
   D. Perfect measurement systems

15. Which Python function can compute a z-score?
   A. pandas.normalize()
   B. scipy.stats.zscore()
   C. numpy.z()
   D. matplotlib.stats()
Answers and explanations

1. B. A z-score measures how many standard deviations a value is from the mean.
2. B. Tidy data makes it easier to visualize and analyze because each variable is a column and each observation a row.
3. C. pandas is the most common Python library for data manipulation and tabular data.
4. B. Recall is useful on imbalanced datasets because it focuses on correctly identifying the positive class.
5. B. R² indicates the proportion of variance in the dependent variable that is explained by the predictors.
6. C. Historical datasets commonly have missing or incomplete records due to preservation and collection practices.
7. B. groupby() groups rows by a key and allows aggregate operations (e.g., sum, mean) per group.
8. C. Cross-validation evaluates model performance on held-out folds the model has not seen, which helps detect overfitting.
9. B. Feature engineering creates and transforms variables to help models learn patterns better.
10. B. Histograms show the distribution of continuous variables by binning values.
11. B. A z-score of +2.5 is 2.5 standard deviations above the mean.
12. B. R has a rich set of statistical packages and expressive visualization (ggplot2).
13. B. Normalization rescales numeric values, commonly to 0–1, to make features comparable.
14. B. Bias occurs because records may be selective, incomplete, or created under historical constraints.
15. B. scipy.stats.zscore() is a ready-made function; you can also compute (x-mean)/std manually.
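Questions 1, 11 and 15 all concern z-scores. As a minimal sketch of the manual calculation (x - mean) / std, the snippet below uses only the Python standard library. The helper name `z_scores` and the `ages` list are invented for illustration. It uses the population standard deviation (dividing by n), which should match what scipy.stats.zscore() does with its default ddof=0.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the z-score of each value: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (divides by n)
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(x - mu) / sigma for x in values]


# Hypothetical data, invented for this example (not from a real source).
ages = [2, 4, 4, 4, 5, 5, 7, 9]
scores = z_scores(ages)

# The mean is 5 and the standard deviation is 2, so the value 9
# sits 2 standard deviations above the mean.
print(scores[-1])
```

A positive score means the value lies above the mean and a negative score means it lies below.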
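Question 4 asks why accuracy can mislead on imbalanced data. The sketch below, using invented labels, shows a classifier that always predicts the majority class: its accuracy looks high while its recall for the rare positive class is zero. The function names `accuracy` and `recall` are written here for illustration and are not taken from any library.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of actual positives that were predicted as positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels: 5 positives among 100 records.
y_true = [1] * 5 + [0] * 95
# A naive model that always predicts the majority class (0).
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(acc, rec)
```

Here the model is right 95% of the time yet finds none of the positive cases, which is why recall is the more informative number in this setting.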
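For question 5, R² can be computed by hand as 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the total sum of squares around the mean. The following is a rough sketch for a simple one-predictor least-squares fit. The helpers `fit_line` and `r_squared` and the data are assumptions made for this example.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


def r_squared(x, y):
    """Proportion of the variance in y explained by the fitted line."""
    slope, intercept = fit_line(x, y)
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data, invented for this example.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2 = r_squared(x, y)
print(round(r2, 3))
```

An R² close to 1 means the line accounts for most of the variation in y, while a value near 0 means it explains very little.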
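To make the groupby() explanation from question 7 concrete, here is a small sketch that assumes pandas is installed. The parish names and baptism counts are invented and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical records, invented for this example.
records = pd.DataFrame(
    {
        "parish": ["St Mary", "St Mary", "St John", "St John", "St John"],
        "baptisms": [12, 15, 7, 9, 8],
    }
)

# Split the rows by parish, then compute the sum and mean within each group.
per_parish = records.groupby("parish")["baptisms"].agg(["sum", "mean"])
print(per_parish)
```

The result has one row per parish, with the aggregate values as columns.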
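Question 13 describes normalization as rescaling to a range such as 0–1. One common way to do this is min-max scaling, (x - min) / (max - min), sketched below with the standard library only. The function name `min_max_scale` and the `prices` list are made up for illustration, and other normalization schemes exist.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale when all values are equal")
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical prices, invented for this example.
prices = [10, 20, 25, 50]
scaled = min_max_scale(prices)
print(scaled)
```

After scaling, features measured in very different units can be compared on the same 0–1 range.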
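The idea behind cross-validation in question 8 can be sketched as a k-fold split: the data is cut into k folds, and each fold takes one turn as the held-out test set while the rest is used for training. The generator below is a simplified illustration with a hypothetical name, `k_fold_indices`. It uses contiguous folds without shuffling, whereas real workflows usually shuffle first.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n rows."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, test
        start = stop


# Example: 10 rows split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), test)
```

Each row lands in exactly one test fold, so every observation is used once to evaluate a model that was not trained on it.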