Data Science Quiz For Humanities
R-bloggers 2025-11-22
[This article was first published on coding-the-past, and kindly contributed to R-bloggers]. (You can report an issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Test your skills with this interactive data science quiz covering statistics, Python, R, and data analysis.
1
Which of the following best describes a z-score?
A measure of central tendency
The number of standard deviations a value is from the mean
The square of the correlation coefficient
A type of probability distribution
2
What is the main advantage of using tidy data principles in R?
Increased computation speed
Easier visualization and consistent analysis
Reduced memory usage
Automatically removes missing values
3
In Python, which library is most commonly used for data manipulation?
matplotlib
numpy
pandas
statsmodels
4
Which metric is best for evaluating a classification model on imbalanced data?
Accuracy
Recall
Variance
R-squared
5
In a linear regression, what does R² represent?
Slope of the regression line
Proportion of variance explained by the model
Covariance between variables
Degree of overfitting
6
In historical or humanities datasets, which challenge occurs most frequently?
Excessively large sample sizes
Perfectly standardized variable names
Missing or incomplete records
Highly structured relational databases
7
What does the groupby() function do in pandas?
Sorts values by category
Applies aggregate operations to subsets of data
Removes duplicates
Normalizes columns
8
What is the primary purpose of cross-validation?
Increase training accuracy
Test different loss functions
Evaluate a model on held-out data to detect overfitting
Speed up model training
9
Feature engineering refers to:
Training a model with more iterations
Preparing input variables to improve model performance
Removing outliers
Selecting the best model
10
Which visualization is most appropriate for the distribution of a continuous variable?
Bar chart
Histogram
Pie chart
Line plot
11
A z-score of +2.5 means:
The value is below the mean
The value is 2.5 SD above the mean
The value is an outlier
The standard deviation is 2.5
12
Which is an advantage of using R for statistical analysis?
Native GPU acceleration
Strong statistical libraries and ggplot2
Automatic machine learning
Faster than Python
13
Normalization in data preprocessing means:
Converting categorical data to numeric
Rescaling values to a standard range like 0–1
Detecting outliers
Filling missing values
14
Why may historical datasets be biased?
They always include all records
Selective or incomplete record-keeping
Automatic modern data collection
Perfect measurement systems
15
Which Python function can compute a z-score?
pandas.normalize()
scipy.stats.zscore()
numpy.z()
matplotlib.stats()
(function(){
const answers = {
q1: 'B', q2: 'B', q3: 'C', q4: 'B', q5: 'B',
q6: 'C', q7: 'B', q8: 'C', q9: 'B', q10: 'B',
q11: 'B', q12: 'B', q13: 'B', q14: 'B', q15: 'B'
};
const total = Object.keys(answers).length;
const form = document.getElementById('quizForm');
const submitBtn = document.getElementById('submitBtn');
const resetBtn = document.getElementById('resetBtn');
const resultEl = document.getElementById('result');
const progressBar = document.getElementById('progressBar');
const progressText = document.getElementById('progressText');
function updateProgress(){
const answered = Array.from(form.querySelectorAll('input[type=radio]'))
.filter(i => i.checked)
.map(i => i.name);
// unique question names answered
const unique = new Set(answered);
const n = unique.size;
const pct = Math.round((n/total)*100);
progressBar.style.width = pct + '%';
progressText.textContent = `Answered ${n} of ${total}`;
}
// update progress when any radio changes
form.addEventListener('change', updateProgress);
function showAnswers(){
let score = 0;
for(const q in answers){
const correct = answers[q];
const selector = `input[name="${q}"]`;
const inputs = Array.from(document.querySelectorAll(selector));
const chosen = inputs.find(i => i.checked);
inputs.forEach(i => {
const label = i.parentElement;
label.classList.remove('correct','incorrect');
// highlight correct option
if(i.value === correct){
label.classList.add('correct');
}
});
if(chosen){
if(chosen.value === correct){ score++; }
else {
// mark chosen wrong option red
chosen.parentElement.classList.add('incorrect');
}
}
}
// Disable all inputs after submission
form.querySelectorAll('input[type=radio]').forEach(i => i.disabled = true);
// show score with a friendly message
resultEl.innerHTML = `You scored <strong>${score} / ${total}</strong>.` + (score === total ? ' Brilliant! 🎉' : ' Nice attempt — review the highlighted answers.');
// Reveal short explanations (kept brief for the blog)
addExplanations();
}
function addExplanations(){
const explanations = {
q1: 'A z-score measures how many standard deviations a value is from the mean.',
q2: 'Tidy data makes it easier to visualize and analyze because each variable is a column and each observation a row.',
q3: 'pandas is the most common Python library for data manipulation and tabular data.',
q4: 'Accuracy can look high on imbalanced data just by predicting the majority class; recall measures the share of actual positives the model correctly identifies.',
q5: 'R² indicates the proportion of variance in the dependent variable that is explained by the predictors.',
q6: 'Historical datasets commonly have missing or incomplete records due to preservation and collection practices.',
q7: 'groupby() groups rows by a key and allows aggregated operations (e.g., sum, mean) per group.',
q8: 'Cross-validation evaluates the model on held-out folds, which gives a more reliable estimate of performance on unseen data and helps detect overfitting.',
q9: 'Feature engineering creates and transforms variables to help models learn patterns better.',
q10: 'Histograms show the distribution of continuous variables by binning values.',
q11: 'A z-score of +2.5 is 2.5 standard deviations above the mean.',
q12: 'R has a rich set of statistical packages and expressive visualization (ggplot2).',
q13: 'Normalization rescales numeric values, commonly to 0–1, to make features comparable.',
q14: 'Bias occurs because records may be selective, incomplete, or created under historical constraints.',
q15: 'scipy.stats.zscore() is a ready-made function; you can also compute (x-mean)/std manually.'
};
for(const q in explanations){
const section = document.querySelector(`section[data-q="${q}"]`);
if(section && !section.querySelector('.explanation')){
const div = document.createElement('div');
div.className = 'explanation';
div.textContent = explanations[q];
section.appendChild(div);
}
}
}
function resetQuiz(){
// enable inputs and clear checked states
form.querySelectorAll('input[type=radio]').forEach(i => { i.checked = false; i.disabled = false; i.parentElement.classList.remove('correct','incorrect'); });
// remove explanations
form.querySelectorAll('.explanation').forEach(e => e.remove());
resultEl.textContent = '';
progressBar.style.width = '0%';
progressText.textContent = `Answered 0 of ${total}`;
}
submitBtn.addEventListener('click', function(){
// count how many answered
const answeredCount = new Set(Array.from(form.querySelectorAll('input[type=radio]')).filter(i => i.checked).map(i => i.name)).size;
if(answeredCount < total){
if(!confirm(`You have answered ${answeredCount} of ${total}. Submit anyway?`)) return;
}
showAnswers();
});
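// Report the submission to Google Analytics when gtag is loaded.
// Reuses submitBtn (id 'submitBtn'): looking up 'submit-btn' would return null
// unless such an element exists, and the resulting TypeError would skip the setup below.
submitBtn.addEventListener('click', function() {
if (typeof gtag === 'function') {
gtag('event', 'submit_quiz', {
event_category: 'quiz',
event_label: 'data_science_quiz'
});
}
});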
resetBtn.addEventListener('click', function(){ if(confirm('Reset the quiz and try again?')) resetQuiz(); });
// initial progress compute in case some radios are pre-selected
updateProgress();
})();
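The explanations for questions 1, 11, 13, and 15 describe two rescaling operations: the z-score, (x - mean) / std, and normalization to the 0–1 range. As a minimal sketch of both, the Python below uses only the standard library. The function names `z_scores` and `min_max` and the sample values are hypothetical and do not come from the quiz. Like `scipy.stats.zscore()` with its default `ddof=0`, the sketch divides by the population standard deviation unless asked otherwise.

```python
import statistics


def z_scores(values, sample=False):
    """Return (x - mean) / std for every value.

    By default the population standard deviation is used (the same
    convention as scipy.stats.zscore with ddof=0). Pass sample=True
    to divide by the sample standard deviation instead.
    """
    mean = statistics.fmean(values)
    std = statistics.stdev(values) if sample else statistics.pstdev(values)
    if std == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(x - mean) / std for x in values]


def min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined when all values are equal")
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical data: mean is 5 and population standard deviation is 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(z_scores(data))   # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]

scaled = min_max(data)
print(scaled[0], scaled[-1])   # 0.0 1.0
```

In this sample the value 9 has a z-score of 2.0, meaning it lies two standard deviations above the mean. With `sample=True` the z-scores shrink slightly, which is worth remembering when comparing results across tools: R's `sd()`, for example, uses the sample standard deviation.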
To leave a comment for the author, please follow the link and comment on their blog: coding-the-past.