Modern Data Mining with Python  
A risk-managed approach to developing and deploying explainable and efficient algorithms using ModelOps (English Edition)
Published by BPB Publications
ISBN: 9789355519146
Pages: 438

EBOOK (EPUB)

ISBN: 9789355519146
Price: INR 899.00
  
"Modern Data Mining with Python" is a guidebook for responsibly implementing data mining techniques that involve collecting, storing, and analyzing large amounts of structured and unstructured data to extract useful insights and patterns. Enter the world of data mining and machine learning. Use insights from various data sources, from social media to credit card transactions. Master statistical tools and explore data trends and patterns. Understand decision trees and artificial neural networks (ANNs). Manage high-dimensional data with dimensionality reduction. Explore binary classification with logistic regression. Spot concealed patterns with unsupervised learning. Analyze text with recurrent neural networks (RNNs) and images with convolutional neural networks (CNNs). Ensure model compliance with regulatory standards. After reading this book, readers will have the skills and knowledge needed to use Python for data mining and analysis in an industry setting. They will be able to analyze large structured and unstructured datasets and implement algorithms on them.
Table of contents
  1. Cover
  2. Title Page
  3. Copyright Page
  4. Dedication Page
  5. Foreword
  6. About the Authors
  7. About the Reviewers
  8. Acknowledgement
  9. Preface
  10. Table of Contents
  11. 1. Understanding Data Mining in a Nutshell
    1. Introduction
    2. Structure
    3. Objectives
    4. What defines modern data mining
    5. The lifecycle: Data to insights consumption
    6. Understanding pattern recognition
      1. Significance of the human learning process
      2. The human learning process and mental models
      3. Data: The key ingredient for meaningful patterns and relationships
    7. How machines leverage data to build models
      1. Machine learning process
        1. Two dominant strategies: Classification and regression
        2. Biases and learning shortfalls
        3. Measuring learning accuracy and balancing trade-offs
        4. Can data size and sample impact learning
      2. How do humans benefit from data and learning
      3. Modern-day data mining challenges and possible remediation
    8. Conclusion
    9. Points to remember
  12. 2. Basic Statistics and Exploratory Data Analysis
    1. Introduction
    2. Structure
    3. Objectives
    4. Setting up Python 3.x
    5. Data mining and statistics
      1. Statistics: Foundation, key terms, needs, and types
    6. Descriptive statistics
      1. Graphical and non-graphical exploratory data analysis
      2. Non-graphical and graphical representation of univariate data
      3. Non-graphical representation of multivariate data
      4. Graphical representation of multivariate data
    7. Probability theory
      1. Probability distribution
    8. Inferential statistics
      1. Hypothesis testing with commonly used statistical tests
    9. Introduction to Time Series Data
    10. Exploratory data analysis: HMDA case study
    11. Conclusion
    12. Points to remember
  13. 3. Digging into Linear Regression
    1. Introduction
    2. Structure
    3. Objectives
    4. Linear regression
      1. Background
      2. Under the hood
      3. Challenges and assumptions including multi-collinearity
      4. Detailed EDA
        1. Dataset description
        2. Missing value treatment
        3. Outlier analysis
        4. Correlation
        5. Checking on the assumptions of linear regression
      5. Feature selection
      6. Regression execution and results
      7. Regression result interpretation
    5. Optimization algorithm
      1. Gradient descent
    6. Regularization
      1. Lasso regression
      2. Ridge regression
      3. Elastic-Net regression
    7. MLflow introduction: Need and implementation
      1. MLflow experiment tracking
    8. Case study
    9. Conclusion
    10. Points to remember
  14. 4. Exploring Logistic Regression
    1. Introduction
    2. Structure
    3. Objectives
    4. Logistic regression
    5. Background
    6. Under the hood
      1. Data
      2. Estimating probabilities
      3. Loss function
    7. Challenges and assumptions
    8. Logistic regression result and interpretation
    9. Model interpretability and explainability
    10. Performance metrics
    11. Model generalization
      1. K-fold cross-validation
      2. Ensemble learning
    12. Model lifecycle processes
    13. Model development process
    14. Case study: Loan repayment likelihood prediction
    15. Conclusion
    16. Points to remember
  15. 5. Decision Trees with Bagging and Boosting
    1. Introduction
    2. Structure
    3. Objectives
    4. Decision trees
      1. Background
      2. Under the hood
        1. Data
        2. Model
        3. Loss function
      3. Challenges and assumptions
      4. Decision tree result and interpretation
    5. Ensembling: Bagging, boosting, and stacking
      1. Random forest
      2. Gradient boosting
      3. Ensembling using the stacking method
    6. Conclusion
    7. Points to remember
  16. 6. Support Vector Machines and K-Nearest Neighbors
    1. Introduction
    2. Structure
    3. Objectives
    4. Classification algorithms with a twist
      1. Background
      2. Under the hood
        1. Data
        2. Model
        3. Loss function: Achieving optimal algorithmic results
    5. Challenges and assumptions
    6. Case study: Predicting customer propensity to subscribe to a term deposit
    7. Conclusion
    8. Points to remember
  17. 7. Putting Dimensionality Reduction into Action
    1. Introduction
    2. Structure
    3. Objectives
    4. Dimensionality reduction
    5. Background
    6. Under the dimensionality reduction hood
      1. Data
      2. Model: Reducing dimensions and variance
        1. Principal component analysis
        2. Linear discriminant analysis
        3. t-distributed Stochastic Neighbor Embedding
      3. Loss: Measuring Variance Reduction
    7. Challenges and assumptions
    8. Case study: Predicting loan repayment propensity using logistic regression, PCA, and LDA
    9. PCA parameters and interpretation
    10. LDA parameters and interpretation
    11. Logistic regression
    12. Conclusion
    13. Further reading
    14. Points to remember
  18. 8. Beginning with Unsupervised Models
    1. Introduction
    2. Structure
    3. Objectives
    4. Unsupervised learning
    5. Background
    6. Unsupervised learning techniques
      1. Data
      2. Model: Building meaningful clusters and profiling them
        1. K-means clustering
        2. Density-based spatial clustering of applications with noise
        3. Hierarchical clustering
      3. Loss: Efficiently achieving the optimal number of clusters
    7. Challenges and assumptions
    8. Case study: Bank customer portfolio segmentation
    9. Advanced unsupervised learning: A primer
    10. Conclusion
    11. Points to remember
  19. 9. Structured Data Classification using Artificial Neural Networks
    1. Introduction
    2. Structure
    3. Objectives
    4. Artificial neural network
    5. Background
    6. Under the hood of neural networks
      1. Data
      2. Model
      3. Loss function: Achieving optimal results
        1. Back-propagation and regularization
    7. Challenges and assumptions
    8. Case study: Explainable and Interpretable ANN Model
      1. Interpretable and explainable AI using SHAP and PiML
    9. Conclusion
    10. Points to remember
  20. 10. Language Modeling with Recurrent Neural Networks
    1. Introduction
    2. Structure
    3. Objectives
    4. Language modeling
    5. Background
    6. Under the hood of language modeling
      1. Data: From spoken languages to modeling datasets
      2. Model: The language with context
        1. Recurrent neural network
        2. Long short term memory
      3. Loss: Quest for the best model
    7. Challenges and assumptions related to text data and model
    8. Case study: Customer complaint classification explained with LIME
    9. Rise of transformers: A primer on BERT and GPT
    10. Conclusion
    11. Further reading
    12. Points to remember
  21. 11. Image Processing with Convolutional Neural Networks
    1. Introduction
    2. Structure
    3. Objectives
    4. Deep learning for computer vision tasks
    5. Background
    6. Under the hood of CNN models
      1. Data
      2. Model
      3. Loss: How to achieve optimal results
    7. Challenges and assumptions
    8. The race for the best model and transfer learning: A primer
    9. Case study: PDF document parser
    10. Conclusion
    11. Further reading
    12. Points to remember
  22. 12. Understanding Model Risk Management for Data Mining Models
    1. Introduction
    2. Structure
    3. Objectives
    4. Data mining challenges and risks
      1. Why do model risks occur
    5. Introduction to Model Risk Management
      1. Key regulatory frameworks
      2. Pillars of Model Risk Management
    6. Introduction to Model Operations
    7. ModelOps: Product first vs. model first mindset
    8. How ModelOps facilitates MRM
    9. Case study: Regulatory requirement fulfillment using MRM and ModelOps
    10. Conclusion
    11. Points to remember
  23. 13. Adopting ModelOps to Manage Model Risk
    1. Introduction
    2. Structure
    3. Objectives
    4. Model risk management for fair banking
    5. Background
    6. Case study: Fair lending model lifecycle implementation - concept to inference
      1. Fair lending model lifecycle
      2. Data
      3. Model Operations tools primer
      4. Architecting the model lifecycle using ModelOps
      5. Fair Lending Risk Assessment: The application
    7. Challenges and assumptions
    8. Future of AI and its practitioners
    9. Conclusion
    10. Further reading
    11. Points to remember
  24. Index