Q1: What is MLOps in machine learning?
A: MLOps applies DevOps principles to machine learning, enabling automated model deployment, monitoring, and lifecycle management.
Q2: What are the main stages of the ML lifecycle?
A: Data collection, preprocessing, model training, validation, deployment, and monitoring.
Q3: What is model drift?
A: It’s when a model’s performance degrades over time due to changes in data distribution.
Q4: How do you detect data drift in production?
A: Use statistical tests like KL divergence or monitoring tools like Evidently or WhyLabs.
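A minimal sketch of a statistical drift check, assuming scipy is available; the `reference` and `production` samples are illustrative stand-ins for a training-time feature and its live counterpart. Tools like Evidently wrap similar tests behind dashboards and reports.

```python
import numpy as np
from scipy.stats import ks_2samp

# Illustrative samples: a reference (training-time) feature vs. live traffic.
reference = np.random.normal(loc=0.0, scale=1.0, size=5_000)
production = np.random.normal(loc=0.3, scale=1.2, size=5_000)

# Two-sample Kolmogorov-Smirnov test: a small p-value suggests the
# production distribution has shifted away from the reference.
statistic, p_value = ks_2samp(reference, production)
if p_value < 0.01:
    print(f"Possible drift (KS statistic={statistic:.3f}, p={p_value:.4f})")
else:
    print("No significant drift detected")
```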
Q5: What is the difference between AI and ML?
A: AI is the broader concept of machines simulating intelligence; ML is a subset focusing on learning from data.
Q6: What is feature engineering?
A: The process of selecting, transforming, and creating new variables to improve model performance.
Q7: What is a pipeline in MLOps?
A: A sequence of automated steps (e.g., preprocessing, training, evaluation, deployment).
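A minimal scikit-learn sketch of the idea: preprocessing and training chained into a single object that can be fit, evaluated, and deployed as one unit (the dataset and steps are illustrative).

```python
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Each step is (name, estimator); the pipeline fits them in order.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
pipe.fit(X_train, y_train)
print("Test accuracy:", pipe.score(X_test, y_test))
```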
Q8: Why is model monitoring important in MLOps?
A: To ensure performance and accuracy stay consistent in real-world environments.
Q9: What is hyperparameter tuning?
A: Searching for the best values of settings that are fixed before training (e.g., learning rate, tree depth, regularization strength) to improve model performance.
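A small grid-search sketch with scikit-learn; the parameter grid and dataset are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Try each combination of hyperparameters with 5-fold cross-validation.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=5,
)
grid.fit(X, y)
print("Best params:", grid.best_params_)
print("Best CV score:", grid.best_score_)
```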
Q10: Name popular tools for MLOps pipelines.
A: MLflow, Kubeflow, SageMaker Pipelines, Vertex AI.
Deployment & Scalability
Q11: What is model versioning?
A: Tracking different model builds for auditability and rollback.
Q12: What is A/B testing in ML?
A: Comparing two model versions on live traffic to evaluate performance.
Q13: How do you scale a machine learning model?
A: Scale training with distributed or parallel training, and scale inference with batch processing, model optimization (e.g., quantization), and horizontally scaled serving replicas.
Q14: What is serverless deployment in ML?
A: Deploying models using cloud functions without managing infrastructure.
Q15: What is model serving?
A: Making a trained model available for inference via APIs.
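A minimal sketch of an inference API, assuming FastAPI and uvicorn are installed and that a scikit-learn model has been pickled to "model.pkl" (the file name, module name, and request schema are all illustrative).

```python
import pickle
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
with open("model.pkl", "rb") as f:  # illustrative path to a trained model
    model = pickle.load(f)

class Features(BaseModel):
    values: list[float]

@app.post("/predict")
def predict(features: Features):
    prediction = model.predict([features.values])
    return {"prediction": prediction.tolist()}

# Run with: uvicorn serve:app --host 0.0.0.0 --port 8000  (assumes this file is serve.py)
```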
Q16: What is the role of Kubernetes in MLOps?
A: It manages containerized ML workloads for scalability and automation.
Q17: What is edge AI?
A: Running AI models on edge devices for real-time inference.
Q18: What is the use of ONNX in ML deployment?
A: It enables model interoperability between frameworks like PyTorch and TensorFlow.
Q19: How do you secure ML models in production?
A: Use authentication, encryption, and monitor API access.
Q20: What is batch inference?
A: Running predictions on large datasets offline, as opposed to real-time.
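A sketch of an offline scoring job, assuming pandas and a pickled model; file names and the assumption that the CSV columns match the model's features are illustrative.

```python
import pickle
import pandas as pd

with open("model.pkl", "rb") as f:  # illustrative path to a trained model
    model = pickle.load(f)

# Stream a large CSV in chunks and append predictions to an output file.
for i, chunk in enumerate(pd.read_csv("input_data.csv", chunksize=100_000)):
    chunk["prediction"] = model.predict(chunk)
    chunk.to_csv("scored_output.csv", mode="a", header=(i == 0), index=False)
```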
Q21: What is precision vs recall in ML?
A: Precision = TP / (TP + FP), the fraction of predicted positives that are correct; recall = TP / (TP + FN), the fraction of actual positives that are found.
Q22: What is a confusion matrix?
A: A table showing TP, FP, TN, FN to evaluate classification performance.
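A small scikit-learn sketch computing precision, recall, and the confusion matrix from illustrative labels.

```python
from sklearn.metrics import precision_score, recall_score, confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall:   ", recall_score(y_true, y_pred))      # TP / (TP + FN)
# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]]
print(confusion_matrix(y_true, y_pred))
```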
Q23: What is AutoML?
A: Tools that automate parts of model building (algorithm selection, hyperparameter tuning, feature preprocessing) so models can be built with minimal manual coding.
Q24: What is transfer learning?
A: Leveraging a pre-trained model on a new but related task.
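A minimal PyTorch/torchvision sketch of the common pattern: load a pre-trained backbone, freeze it, and replace the final layer. The `weights` argument name and the 5-class head are assumptions; the exact API varies across torchvision versions.

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pre-trained backbone (weights argument is version-dependent).
model = models.resnet18(weights="DEFAULT")

# Freeze the pre-trained layers so only the new head is trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier head for a new task with, say, 5 classes (illustrative).
model.fc = nn.Linear(model.fc.in_features, 5)
```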
Q25: What are high-variance and high-bias models?
A: A high-variance model overfits (it captures noise in the training data), while a high-bias model underfits (it is too simple to capture the underlying pattern).
Q26: What is cross-validation?
A: A technique that splits the data into several folds, repeatedly training on some folds and evaluating on the held-out fold, to estimate how well the model generalizes to unseen data.
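A short scikit-learn sketch of 5-fold cross-validation (dataset and model are illustrative).

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Train on 4 folds, evaluate on the held-out fold, repeat 5 times.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```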
Q27: What is gradient descent?
A: An optimization algorithm that iteratively updates model parameters in the direction of the negative gradient of the loss function until the loss is minimized.
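A toy numpy sketch of gradient descent on 1-D linear regression (data, learning rate, and iteration count are illustrative).

```python
import numpy as np

# Fit y = w*x + b by minimizing mean squared error with gradient descent.
X = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * X + 1.0

w, b, lr = 0.0, 0.0, 0.05
for _ in range(2000):
    error = (w * X + b) - y
    # Gradients of MSE with respect to w and b.
    grad_w = 2 * np.mean(error * X)
    grad_b = 2 * np.mean(error)
    w -= lr * grad_w
    b -= lr * grad_b

print(f"Learned w={w:.2f}, b={b:.2f}")  # should approach w=2, b=1
```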
Q28: What is the benefit of using GPUs in ML training?
A: Faster parallel processing for large datasets and deep learning models.
Q29: What is ensemble learning?
A: Combining multiple models to improve performance (e.g., random forest, boosting).
Q30: What is regularization in ML?
A: A technique to reduce overfitting by penalizing model complexity.
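A small scikit-learn sketch contrasting an unregularized linear fit with an L2-regularized (ridge) fit; the synthetic data is illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Noisy data where an unregularized fit can chase the noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))
y = X[:, 0] + 0.1 * rng.normal(size=30)

plain = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty shrinks coefficients toward zero

print("Unregularized coefficients:", np.abs(plain.coef_).round(2))
print("Ridge coefficients:        ", np.abs(ridge.coef_).round(2))
```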
AI Ethics, Explainability, and Interpretability
Q31: What is explainable AI (XAI)?
A: Techniques that make model decisions understandable to humans.
Q32: What is SHAP in model interpretability?
A: A tool that assigns feature importance values for predictions.
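A minimal sketch using the shap package with a tree model; the dataset and model are illustrative, and shap must be installed separately.

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Each SHAP value is one feature's contribution to one prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])

# Summary plot ranks features by their average impact on the model output.
shap.summary_plot(shap_values, X.iloc[:100])
```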
Q33: What are ethical concerns in AI?
A: Bias, privacy, job displacement, and accountability.
Q34: What is the GDPR impact on AI?
A: Requires transparency and rights around automated decision-making.
Q35: What is fairness in ML?
A: Ensuring models don’t favor or discriminate against any group.
Q36: How do you audit AI systems?
A: Use fairness metrics, bias detection tools, and third-party review.
Q37: What are adversarial attacks in ML?
A: Manipulating input data to fool AI models.
Q38: What is federated learning?
A: Training models across decentralized devices without sharing data.
Q39: What is differential privacy?
A: A framework that adds calibrated noise to computations so the output reveals little about any single individual's data, with the privacy loss quantified by a budget (epsilon).
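A toy illustration of the Laplace mechanism, a basic building block of differentially private releases; the bounds, epsilon, and function name are illustrative.

```python
import numpy as np

def private_mean(values, lower, upper, epsilon):
    """Release a mean with Laplace noise calibrated to its sensitivity."""
    values = np.clip(values, lower, upper)
    sensitivity = (upper - lower) / len(values)  # max effect of one individual
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return values.mean() + noise

ages = np.array([23, 35, 41, 29, 52, 47, 31, 38])
print("Noisy mean (epsilon=0.5):", private_mean(ages, 18, 90, epsilon=0.5))
```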
Q40: Why is interpretability critical in finance/healthcare AI?
A: For compliance, trust, and legal accountability.
Tools & Technologies: ML tools, AI cloud platforms
Q41: What is MLflow used for?
A: Experiment tracking, model packaging, and lifecycle management.
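A minimal tracking sketch; by default MLflow writes results to a local ./mlruns directory, and the parameter and metric values here are illustrative.

```python
import mlflow

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("rmse", 0.42)
    # mlflow.sklearn.log_model(model, "model") would also package a fitted model
```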
Q42: What is TensorFlow Serving?
A: A flexible, high-performance serving system for ML models.
Q43: What is the use of SageMaker in MLOps?
A: It provides tools for model building, training, tuning, and deployment in AWS.
Q44: What is the role of Airflow in ML pipelines?
A: Orchestration of complex workflows, including ML tasks.
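A sketch of a weekly retraining DAG; the task bodies are placeholders, and argument names (e.g., `schedule`) vary slightly across Airflow 2.x versions.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_features():
    ...  # illustrative placeholder

def train_model():
    ...  # illustrative placeholder

with DAG(
    dag_id="ml_retraining",
    start_date=datetime(2024, 1, 1),
    schedule="@weekly",
    catchup=False,
) as dag:
    features = PythonOperator(task_id="extract_features", python_callable=extract_features)
    train = PythonOperator(task_id="train_model", python_callable=train_model)
    features >> train  # run feature extraction before training
```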
Q45: What is TFX (TensorFlow Extended)?
A: A production-ready platform for deploying ML pipelines with TensorFlow.
Q46: What is a model registry?
A: A system to store and manage ML models and their metadata.
Q47: What is experiment tracking in ML?
A: Recording model metrics, parameters, and results to improve reproducibility.
Q48: What are retraining pipelines?
A: Automated workflows to retrain models as new data becomes available.
Q49: What is the role of DataOps in MLOps?
A: Ensures reliable, automated data pipelines for training and inference.
Q50: What is the difference between online and offline learning in ML?
A: Online learning updates the model incrementally as new data arrives; offline (batch) learning trains on a fixed dataset all at once.
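A short scikit-learn sketch of the online style, updating a model batch by batch with partial_fit; the simulated data stream is illustrative.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier()
classes = np.array([0, 1])

# Simulate data arriving over time and update the model incrementally.
for _ in range(10):
    X_batch = np.random.normal(size=(50, 4))
    y_batch = (X_batch[:, 0] > 0).astype(int)
    model.partial_fit(X_batch, y_batch, classes=classes)

print("Updated model coefficients:", model.coef_.round(2))
```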