Principles of Data Science (數據科學原理, reprint edition)

Author: Sinan Ozdemir (USA)
Publisher: Southeast University Press (東南大學出版社)
Publication date: 2017-10-01
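Stock is held overseas and availability may change.
Shipping is available by air or by sea; air freight costs RM14 per item.
Delivery time: about 8 to 12 working days by air, about 30 working days by sea.
(These estimates exclude delays when the publisher is out of stock and must restock, and titles not yet published.)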
List price: NT552.00
Market price: RM99.24
Store price: RM88.32
Promotional price: RM87.33
Description

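This book aims to help you bring together mathematics, programming, and business analysis. With it, you will feel confident facing complex problems, whether you are working with abstract, raw statistics or with actionable ideas. It takes a distinctive approach to bridging mathematics and computer science, and over the course of this learning journey you will grow into a data scientist.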

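Starting with cleaning and preparing data, then moving on to effective data mining strategies and techniques, you will work through the entire data science pipeline and build a big-picture view of how its components fit together. You will learn the fundamentals of mathematics and statistics, as well as some of the pseudocode used today by data scientists and analysts.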

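You will also get to grips with machine learning, learn about useful statistical models that help you control and work with very dense datasets, and find out how to create visualizations that communicate what your data means.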


Table of Contents

Preface

Chapter 1: How to Sound Like a Data Scientist

What is data science?

Basic terminology

Why data science?

Example - Sigma Technologies

The data science Venn diagram

The math

Example - spawner-recruit models

Computer programming

Why Python?

Python practices

Example of basic Python

Domain knowledge

Some more terminology

Data science case studies

Case study - automating government paper pushing

Fire all humans, right?

Case study - marketing dollars

Case study - what’s in a job description?

Summary

Chapter 2: Types of Data

Flavors of data

Why look at these distinctions?

Structured versus unstructured data

Example of data preprocessing

Word/phrase counts

Presence of certain special characters

Relative length of text

Picking out topics

Quantitative versus qualitative data

Example - coffee shop data

Example - world alcohol consumption data

Digging deeper

The road thus far

The four levels of data

The nominal level

Mathematical operations allowed

Measures of center

What data is like at the nominal level

The ordinal level

Examples

Mathematical operations allowed

Measures of center

Quick recap and check

The interval level

Example

Mathematical operations allowed

Measures of center

Measures of variation

The ratio level

Examples

Measures of center

Problems with the ratio level

Data is in the eye of the beholder

Summary

Chapter 3: The Five Steps of Data Science

Introduction to Data Science

Overview of the five steps

Ask an interesting question

Obtain the data

Explore the data

Model the data

Communicate and visualize the results

Explore the data

Basic questions for data exploration

Dataset 1 - Yelp

Dataframes

Series

Exploration tips for qualitative data

Dataset 2 - titanic

Summary

Chapter 4: Basic Mathematics

Mathematics as a discipline

Basic symbols and terminology

Vectors and matrices

Quick exercises

Answers

Arithmetic symbols

Summation

Proportional

Dot product

Graphs

Logarithms/exponents

Set theory

Linear algebra

Matrix multiplication

How to multiply matrices

Summary

Chapter 5: Impossible or Improbable - A Gentle Introduction to Probability

Basic definitions

Probability

Bayesian versus Frequentist

Frequentist approach

The law of large numbers

Compound events

Conditional probability

The rules of probability

The addition rule

Mutual exclusivity

The multiplication rule

Independence

Complementary events

A bit deeper

Summary

Chapter 6: Advanced Probability

Collectively exhaustive events

Bayesian ideas revisited

Bayes theorem

More applications of Bayes theorem

Example - Titanic

Example - medical studies

Random variables

Discrete random variables

Types of discrete random variables

Summary

Chapter 7: Basic Statistics

What are statistics?

How do we obtain and sample data?

Obtaining data

Observational

Experimental

Sampling data

Probability sampling

Random sampling

Unequal probability sampling

How do we measure statistics?

Measures of center

Measures of variation

Definition

Example - employee salaries

Measures of relative standing

The insightful part - correlations in data

The Empirical rule

Summary

Chapter 8: Advanced Statistics

Point estimates

Sampling distributions

Confidence intervals

Hypothesis tests

Conducting a hypothesis test

One sample t-tests

Example of a one sample t-tests

Assumptions of the one sample t-tests

Type I and type II errors

Hypothesis test for categorical variables

Chi-square goodness of fit test

Chi-square test for association/independence

Summary

Chapter 9: Communicating Data

Why does communication matter?

Identifying effective and ineffective visualizations

Scatter plots

Line graphs

Bar charts

Histograms

Box plots

When graphs and statistics lie

Correlation versus causation

Simpson’s paradox

If correlation doesn’t imply causation, then what does?

Verbal communication

It’s about telling a story

On the more formal side of things

The why/how/what strategy of presenting

Summary

Chapter 10: How to Tell If Your Toaster Is Learning - Machine Learning Essentials

What is machine learning?

Machine learning isn’t perfect

How does machine learning work?

Types of machine learning

Supervised learning

It’s not only about predictions

Types of supervised learning

Data is in the eyes of the beholder

Unsupervised learning

Reinforcement learning

Overview of the types of machine learning

How does statistical modeling fit into all of this?

Linear regression

Adding more predictors

Regression metrics

Logistic regression

Probability, odds, and log odds

The math of logistic regression

Dummy variables

Summary

Chapter 11: Predictions Don’t Grow on Trees - or Do They?

Naïve Bayes classification

Decision trees

How does a computer build a regression tree?

How does a computer fit a classification tree?

Unsupervised learning

When to use unsupervised learning

K-means clustering

Illustrative example - data points

Illustrative example - beer!

Choosing an optimal number for K and cluster validation

The Silhouette Coefficient

Feature extraction and principal component analysis

Summary

Chapter 12: Beyond the Essentials

The bias variance tradeoff

Error due to bias

Error due to variance

Two extreme cases of bias/variance tradeoff

Underfitting

Overfitting

How bias/variance play into error functions

K folds cross-validation

Grid searching

Visualizing training error versus cross-validation error

Ensembling techniques

Random forests

Comparing Random forests with decision trees

Neural networks

Basic structure

Summary

Chapter 13: Case Studies

Case study 1 - predicting stock prices based on social media

Text sentiment analysis

Exploratory data analysis

Regression route

Classification route

Going beyond with this example

Case study 2 - why do some people cheat on their spouses?

Case study 3 - using tensorflow

Tensorflow and neural networks

Summary

Index