Top MLOps Interview Q&A for Cloud AI Engineers

Q1: What is MLOps in machine learning?

A: MLOps applies DevOps principles to machine learning: it automates model deployment, monitoring, and the rest of the model lifecycle.

 

Q2: What are the main stages of the ML lifecycle?

A: Data collection, preprocessing, model training, validation, deployment, and monitoring.

 

Q3: What is model drift?

A: It’s when a model’s performance degrades over time due to changes in data distribution.

 

Q4: How do you detect data drift in production?

A: Compare production feature distributions against the training distribution using measures such as KL divergence or the Kolmogorov–Smirnov test, or use monitoring tools like Evidently or WhyLabs.
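
A minimal sketch of the KL-divergence check, comparing binned histograms of one feature at training time and in production (the histograms and the `kl_divergence` helper are illustrative, not taken from any monitoring library):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) for two distributions given as aligned histograms."""
    total_p, total_q = sum(p), sum(q)
    kl = 0.0
    for pi, qi in zip(p, q):
        pi /= total_p                # normalize counts to probabilities
        qi = max(qi / total_q, eps)  # clamp to avoid division by zero
        if pi > 0:
            kl += pi * math.log(pi / qi)
    return kl

# Histograms of the same feature: training time vs. production.
train_hist = [40, 30, 20, 10]
prod_hist = [10, 20, 30, 40]  # the distribution has shifted

drift_score = kl_divergence(prod_hist, train_hist)  # 0 means identical
```

In practice you would alert when `drift_score` crosses a threshold chosen from historical data.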

 

Q5: What is the difference between AI and ML?

A: AI is the broader concept of machines simulating intelligence; ML is a subset focusing on learning from data.

 

Q6: What is feature engineering?

A: The process of selecting, transforming, and creating new variables to improve model performance.

 

Q7: What is a pipeline in MLOps?

A: A sequence of automated steps (e.g., preprocessing, training, evaluation, deployment).
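
Conceptually, such a pipeline is a chain of functions, each stage feeding the next (the stages below are toy stand-ins for real preprocessing, training, and evaluation):

```python
def preprocess(raw):
    # Normalize each value to [0, 1].
    lo, hi = min(raw), max(raw)
    return [(x - lo) / (hi - lo) for x in raw]

def train(data):
    # Toy "model": just the mean of the training data.
    return sum(data) / len(data)

def evaluate(model, data):
    # Mean absolute error against the toy model.
    return sum(abs(x - model) for x in data) / len(data)

def run_pipeline(raw):
    data = preprocess(raw)
    model = train(data)
    score = evaluate(model, data)
    return model, score

model, score = run_pipeline([3, 6, 9, 12])
```

Real pipeline frameworks (Kubeflow, TFX, Airflow) express the same idea as a DAG of versioned, retryable steps.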

 

Q8: Why is model monitoring important in MLOps?

A: To ensure performance and accuracy stay consistent in real-world environments.

 

Q9: What is hyperparameter tuning?

A: Searching for the training configuration that maximizes validation performance; unlike model weights, hyperparameters (e.g., learning rate, tree depth) are set before training rather than learned.
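
Grid search is the simplest tuning strategy: try every combination and keep the best validation score. In this sketch, `validation_score` is a toy stand-in for training and evaluating a real model:

```python
from itertools import product

def validation_score(lr, depth):
    # Stand-in for "train a model and return validation accuracy";
    # this toy function peaks at lr=0.1, depth=4.
    return 1.0 - abs(lr - 0.1) - 0.05 * abs(depth - 4)

grid = {"lr": [0.01, 0.1, 1.0], "depth": [2, 4, 8]}

best_params, best_score = None, float("-inf")
for lr, depth in product(grid["lr"], grid["depth"]):
    score = validation_score(lr, depth)
    if score > best_score:
        best_params, best_score = {"lr": lr, "depth": depth}, score
```

Random search and Bayesian optimization follow the same loop but choose candidates more cleverly.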

 

Q10: Name popular tools for MLOps pipelines.

A: MLflow, Kubeflow, SageMaker Pipelines, Vertex AI.

 

 

Deployment & Scalability

Q11: What is model versioning?

A: Tracking different model builds for auditability and rollback.

 

Q12: What is A/B testing in ML?

A: Comparing two model versions on live traffic to evaluate performance.
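
One common mechanism is deterministic, hash-based traffic splitting, so a given user always sees the same variant across requests (a sketch; `assign_variant` and the bucket scheme are illustrative):

```python
import hashlib

def assign_variant(user_id, treatment_share=0.5):
    """Deterministically route a user to model A or B by hashing their ID."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    return "model_b" if bucket < treatment_share * 100 else "model_a"

assignments = [assign_variant(f"user-{i}") for i in range(1000)]
share_b = assignments.count("model_b") / len(assignments)
```

Because the assignment depends only on the user ID, metrics for each variant can be compared without storing routing state.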

 

Q13: How do you scale a machine learning model?

A: Use distributed training, batch processing, and model optimization.

 

Q14: What is serverless deployment in ML?


A: Deploying models using cloud functions without managing infrastructure.

 

Q15: What is model serving?

A: Making a trained model available for inference via APIs.

 

Q16: What is the role of Kubernetes in MLOps?

A: It manages containerized ML workloads for scalability and automation.

 

Q17: What is edge AI?

A: Running AI models on edge devices for real-time inference.

 

Q18: What is the use of ONNX in ML deployment?

A: It enables model interoperability between frameworks like PyTorch and TensorFlow.

 

Q19: How do you secure ML models in production?

A: Use authentication, encryption, and monitor API access.

 

Q20: What is batch inference?

A: Running predictions over large datasets offline in chunks, as opposed to serving one request at a time in real time.
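
The pattern in a few lines: iterate over the dataset in fixed-size chunks and collect predictions (the `predict` function stands in for a real model):

```python
def predict(batch):
    # Stand-in for model inference; doubles each input.
    return [2 * x for x in batch]

def batch_inference(dataset, batch_size=3):
    """Score a large dataset in fixed-size chunks instead of one call per row."""
    results = []
    for start in range(0, len(dataset), batch_size):
        results.extend(predict(dataset[start:start + batch_size]))
    return results

preds = batch_inference(list(range(10)))
```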

 

Model Training & Evaluation

Q21: What is precision vs recall in ML?

A: Precision is the fraction of predicted positives that are correct (TP / (TP + FP)); recall is the fraction of actual positives that are found (TP / (TP + FN)).
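
Both metrics fall directly out of the positive/negative counts; a minimal computation for binary labels:

```python
def precision_recall(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # correctness of positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # coverage of actual positives
    return precision, recall

y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
p, r = precision_recall(y_true, y_pred)
```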

 

Q22: What is a confusion matrix?

A: A table showing TP, FP, TN, FN to evaluate classification performance.
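
The four cells can be tallied in a few lines for binary labels (a dict-based sketch; libraries typically return a 2x2 array with rows as actual and columns as predicted):

```python
def confusion_matrix(y_true, y_pred):
    """Tally TP, FP, TN, FN for a binary classifier."""
    m = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            m["tp"] += 1
        elif t == 0 and p == 1:
            m["fp"] += 1
        elif t == 0 and p == 0:
            m["tn"] += 1
        else:
            m["fn"] += 1
    return m

cm = confusion_matrix([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```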

 

Q23: What is AutoML?

A: Automated machine learning tools to build models with minimal coding.

 

Q24: What is transfer learning?

A: Leveraging a pre-trained model on a new but related task.

 

Q25: What are high-variance and high-bias models?

A: A high-variance model overfits, fitting noise in the training data and generalizing poorly; a high-bias model underfits, being too simple to capture the underlying pattern.

 

Q26: What is cross-validation?

A: Estimating how well a model generalizes by repeatedly training on one subset of the data and validating on the held-out remainder (e.g., k-fold).
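
For example, k-fold cross-validation partitions the row indices into k validation folds and trains on the rest each time (a self-contained sketch of the index split):

```python
def kfold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    # Distribute any remainder across the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, val
        start += size

folds = list(kfold_indices(6, 3))
```

Each index lands in exactly one validation fold, so every sample is used for both training and validation across the k runs.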

 

Q27: What is gradient descent?

A: An iterative optimization algorithm that minimizes a loss function by stepping model parameters in the direction of the negative gradient.
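
A minimal one-dimensional example: minimizing f(x) = (x - 3)^2 using only its gradient 2(x - 3):

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimize a function by repeatedly stepping against its gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# The minimum of f(x) = (x - 3)^2 is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Training a neural network is the same loop at scale, with the gradient supplied by backpropagation over mini-batches.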

 

Q28: What is the benefit of using GPUs in ML training?

A: Faster parallel processing for large datasets and deep learning models.

 

Q29: What is ensemble learning?

A: Combining multiple models to improve performance (e.g., random forest, boosting).
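
The simplest ensemble is majority voting over several classifiers (the three lambda "models" below are toy threshold classifiers used only for illustration):

```python
from collections import Counter

def majority_vote(models, x):
    """Ensemble by voting: each model predicts, the most common label wins."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]

# Three weak "classifiers" that disagree on some inputs.
models = [
    lambda x: 1 if x > 0 else 0,
    lambda x: 1 if x > 1 else 0,
    lambda x: 1 if x > -1 else 0,
]

pred = majority_vote(models, 0.5)  # votes are [1, 0, 1] -> 1
```

Random forests vote over decision trees the same way; boosting instead weights models sequentially.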

 

Q30: What is regularization in ML?

A: A technique to reduce overfitting by penalizing model complexity.
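
For example, ridge (L2) regularization adds the squared weight norm to the loss, so larger weights cost more (a toy sketch; `lam` is the regularization strength and the data is illustrative):

```python
def ridge_loss(w, X, y, lam):
    """Mean squared error plus an L2 penalty that discourages large weights."""
    residuals = [sum(wi * xi for wi, xi in zip(w, x)) - t for x, t in zip(X, y)]
    mse = sum(r * r for r in residuals) / len(y)
    l2_penalty = lam * sum(wi * wi for wi in w)
    return mse + l2_penalty

X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 1.0]
loss_unregularized = ridge_loss([1.0, 1.0], X, y, lam=0.0)
loss_regularized = ridge_loss([1.0, 1.0], X, y, lam=0.1)
```

L1 (lasso) regularization penalizes the absolute values of the weights instead, which tends to drive some weights exactly to zero.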

 

AI Ethics, Explainability, and Interpretability

 

Q31: What is explainable AI (XAI)?

A: Techniques that make model decisions understandable to humans.

 

Q32: What is SHAP in model interpretability?

A: A tool that assigns feature importance values for predictions.
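
SHAP is grounded in Shapley values from game theory: a feature's importance is its average marginal contribution over all orderings of the features. For a tiny feature set this can be computed exactly by enumerating subsets (a pure-Python illustration with a toy additive model, not the `shap` library's optimized estimators):

```python
from itertools import combinations
from math import factorial

def exact_shapley(value_fn, features):
    """Exact Shapley values by enumerating all feature subsets."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                # Classic Shapley weight: |S|! (n - |S| - 1)! / n!
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                total += weight * (value_fn(set(subset) | {f}) - value_fn(set(subset)))
        phi[f] = total
    return phi

# Toy additive "model": the prediction is the sum of present feature effects.
effects = {"age": 2.0, "income": 5.0}
phi = exact_shapley(lambda s: sum(effects[f] for f in s), list(effects))
```

For an additive model each feature's Shapley value equals its own effect, which is why the decomposition is considered faithful.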

 

Q33: What are ethical concerns in AI?

A: Bias, privacy, job displacement, and accountability.

 

Q34: What is the GDPR impact on AI?

A: Requires transparency and rights around automated decision-making.

 

Q35: What is fairness in ML?

A: Ensuring models don’t favor or discriminate against any group.

 

Q36: How do you audit AI systems?

A: Use fairness metrics, bias detection tools, and third-party review.

 

Q37: What are adversarial attacks in ML?

A: Manipulating input data to fool AI models.

 

Q38: What is federated learning?

A: Training models across decentralized devices without sharing data.

 

Q39: What is differential privacy?


A: A technique to protect individual data in machine learning models.
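
The classic building block is the Laplace mechanism: before releasing a statistic, add noise scaled to sensitivity / epsilon, so tighter privacy budgets mean more noise (a sketch; a count query has sensitivity 1, and the inverse-CDF sampler is a standard construction):

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample from Laplace(0, scale) by inverting its CDF from a uniform draw."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1, u) * math.log(1 - 2 * abs(u))

def private_count(true_count, epsilon, rng):
    """Laplace mechanism: a count has sensitivity 1, so the noise scale is 1/epsilon."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(42)
noisy = private_count(100, epsilon=1.0, rng=rng)
```

Frameworks like TensorFlow Privacy apply the same idea to gradients during training rather than to released statistics.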

 

Q40: Why is interpretability critical in finance/healthcare AI?

A: For compliance, trust, and legal accountability.

 

 

Tools & Technologies: ML Tools & AI Cloud Platforms

 

Q41: What is MLflow used for?

A: Experiment tracking, model packaging, and lifecycle management.

 

Q42: What is TensorFlow Serving?

A: A flexible, high-performance serving system for ML models.

 

Q43: What is the use of SageMaker in MLOps?

A: It provides tools for model building, training, tuning, and deployment in AWS.

 

Q44: What is the role of Airflow in ML pipelines?

A: Orchestration of complex workflows, including ML tasks.

 

Q45: What is TFX (TensorFlow Extended)?

A: A production-ready platform for deploying ML pipelines with TensorFlow.

 

Q46: What is a model registry?

A: A system to store and manage ML models and their metadata.

 

Q47: What is experiment tracking in ML?

A: Recording model metrics, parameters, and results to improve reproducibility.
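
Tools like MLflow do this at scale; the core idea fits in a small sketch (this in-memory `ExperimentTracker` is illustrative, not any real tool's API):

```python
class ExperimentTracker:
    """Minimal in-memory tracker: records params and metrics per run."""

    def __init__(self):
        self.runs = []

    def log_run(self, params, metrics):
        self.runs.append({"params": params, "metrics": metrics})

    def best_run(self, metric, maximize=True):
        # Pick the run with the best value for the given metric.
        return (max if maximize else min)(self.runs, key=lambda r: r["metrics"][metric])

tracker = ExperimentTracker()
tracker.log_run({"lr": 0.1}, {"accuracy": 0.91})
tracker.log_run({"lr": 0.01}, {"accuracy": 0.88})
best = tracker.best_run("accuracy")
```

Real trackers persist the same records to a backing store and attach artifacts (model files, plots) to each run.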

 

Q48: What are retraining pipelines?

A: Automated workflows to retrain models as new data becomes available.

 

Q49: What is the role of DataOps in MLOps?

A: Ensures reliable, automated data pipelines for training and inference.

 

Q50: What is the difference between online and offline learning in ML?

A: Online learning updates the model incrementally as each new sample arrives; offline (batch) learning trains on a fixed dataset all at once.
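
The contrast is easy to see with a running mean: the online version folds in one sample at a time yet matches the batch computation (a toy sketch):

```python
def offline_mean(samples):
    """Offline: compute the statistic from the full batch at once."""
    return sum(samples) / len(samples)

class OnlineMean:
    """Online: update the estimate incrementally as each sample arrives."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n  # incremental (Welford-style) update
        return self.mean

stream = [2.0, 4.0, 6.0, 8.0]
online = OnlineMean()
for x in stream:
    online.update(x)
```

Online learners (e.g., SGD-based models) apply the same idea to model weights, which is why they suit streaming data.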