Kakarla Ramcharan, Krishnan Sundar, Alla Sridhar Applied Data Science Using Pyspark: Learn the End-To-End Predictive Model-Building Cycle 9781484264997

Purchase options

Price: 7685.00 rub.
Availability: Out of stock. Can be supplied to order.

If ordered by: 2025-07-28
Estimated delivery date: August to early September
Subject to availability from the supplier.


Author: Kakarla Ramcharan, Krishnan Sundar, Alla Sridhar
Title: Applied Data Science Using PySpark: Learn the End-To-End Predictive Model-Building Cycle
ISBN: 9781484264997
Publisher: Springer
Classification:

Data collection and analysis

Machine learning

ISBN-10: 1484264991
Cover/Format: Paperback
Pages: 410
Weight: 0.75 kg
Publication date: 08.01.2021
Language: English
Dimensions: 25.40 x 17.78 x 2.26 cm
Ships from: Germany
Description:

Chapter 1: Setting up the PySpark Environment

Chapter Goal: Introduce readers to the PySpark environment, walk them through the steps to set up the environment, and execute some basic operations

Number of pages: 20

Subtopics:

1. Setting up your environment & data

2. Basic operations

Chapter 2: Basic Statistics and Visualizations

Chapter Goal: Introduce readers to the predictive model-building framework and help them get comfortable with basic data operations

Number of pages: 30

Subtopics:

1. Basic Statistics

2. Data manipulations/feature engineering

3. Data visualizations

4. Model building framework

Chapter 3: Variable Selection

Chapter Goal: Illustrate the different variable selection techniques to identify the top variables in a dataset and how they can be implemented using PySpark pipelines

Number of pages: 40

Subtopics:

1. Principal Component Analysis

2. Weight of Evidence & Information Value

3. Chi-square selector

4. Singular Value Decomposition

5. Voting-based approach
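
For orientation, Weight of Evidence (WoE) and Information Value (IV) from the list above can be sketched in a few lines. This is a minimal illustration in plain Python rather than PySpark, and it is not taken from the book. The function name `woe_iv`, the convention WoE = ln(% non-events / % events), and the requirement that every bin contain at least one event and one non-event are all assumptions made for the example.

```python
import math
from collections import Counter


def woe_iv(bins, labels):
    """Return ({bin: WoE}, IV) for a binned feature and binary labels (1 = event).

    Hypothetical helper for illustration only.
    Assumes every bin contains at least one event and one non-event.
    """
    events = Counter()
    non_events = Counter()
    for b, y in zip(bins, labels):
        if y == 1:
            events[b] += 1
        else:
            non_events[b] += 1

    total_events = sum(events.values())
    total_non_events = sum(non_events.values())

    woe = {}
    iv = 0.0
    for b in sorted(set(bins)):
        # Share of all events / non-events that fall into this bin
        pct_event = events[b] / total_events
        pct_non_event = non_events[b] / total_non_events
        # Assumed convention: WoE = ln(% non-events / % events)
        woe[b] = math.log(pct_non_event / pct_event)
        # IV sums the WoE-weighted gap between the two distributions
        iv += (pct_non_event - pct_event) * woe[b]
    return woe, iv


bins = ["a"] * 4 + ["b"] * 4
labels = [1, 0, 0, 0, 1, 1, 1, 0]
woe, iv = woe_iv(bins, labels)
```

A variable with a higher IV separates events from non-events more strongly, which is why IV is commonly used to rank candidate variables.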

Chapter 4: Introduction to different supervised machine learning algorithms, implementations & fine-tuning techniques

Chapter Goal: Explain and demonstrate supervised machine learning techniques and help readers understand the challenges and nuances of model fitting with multiple evaluation metrics

Number of pages: 40

Subtopics:

1. Supervised:

- Linear regression

- Logistic regression

- Decision Trees

- Random Forests

- Gradient Boosting

- Neural Nets

- Support Vector Machine

- One Vs Rest Classifier

- Naive Bayes

2. Model hyperparameter tuning:

- L1 & L2 regularization

- Elastic net
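
As a rough illustration of how the L1, L2, and elastic net items above relate, the sketch below computes the elastic net penalty in plain Python. It is not code from the book. The function name `elastic_net_penalty` is hypothetical; the parameter names mirror the `regParam` and `elasticNetParam` settings of Spark ML's linear and logistic regression, and the formula follows the form given in the Spark MLlib documentation.

```python
def elastic_net_penalty(weights, reg_param, elastic_net_param):
    """Penalty = reg_param * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2).

    Hypothetical helper for illustration only.
    alpha (elastic_net_param) = 1 gives a pure L1 (lasso) penalty,
    alpha = 0 gives a pure L2 (ridge) penalty.
    """
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    alpha = elastic_net_param
    return reg_param * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


weights = [1.0, -2.0]
lasso = elastic_net_penalty(weights, reg_param=0.5, elastic_net_param=1.0)
ridge = elastic_net_penalty(weights, reg_param=0.5, elastic_net_param=0.0)
mixed = elastic_net_penalty(weights, reg_param=0.5, elastic_net_param=0.5)
```

Intermediate values of the mixing parameter blend the two penalties, so the L1 and L2 cases are the two endpoints of the same setting.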

Chapter 5: Model Validation and Selecting the Best Model

Chapter Goal: Illustrate the different techniques used to validate models, demonstrate which technique should be used for a particular model selection task, and finally pick the best model from the candidates

Number of pages: 30

Subtopics:

1. Model Validation Statistics:

- ROC

- Accuracy

- Precision

- Recall

- F1 Score

- Misclassification

- KS

- Decile

- Lift & Gain

- R square

- Adj
