Article category - Data Mining
Summary: We all have words we love to use, and that we perhaps use too much. As an example: I have a tendency to use the same transitional statements, to the p
Summary: In this blog post I will discuss missing data imputation and instrumental variables regression. This is based on a short presentation I will give at m
Summary: The larger and more complex the business, the more metrics and dimensions it has. One day you understand that it is impossible to track them with only your ey
Summary: Since I migrated my blog from Github Pages to blogdown and Netlify, I wanted to start migrating (most of) my old posts too - and use that opportunity
Summary: Time series prediction (forecasting) has experienced dramatic improvements in predictive accuracy as a result of the data science machine learning and
Summary: Today I saw this tweet on my timeline: For those of us that just can't wait until RStudio officially supports parallel purrr in #rstats, boy have I go
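Summary: Author: Chen Siduo (陈思多), VP of Big Data and Growth at Camera360. This article is adapted from a talk given at a TGO 鲲鹏会 Chengdu chapter event; the original appeared on the WeChat official account "TGO鲲鹏会" and is reposted with permission. Hello everyone, I am Chen Siduo, VP of Big Data and Growth at Camera360. Camera360 is a product from Chengdu Pinguo Technology (成都品果科技), based on iOS, Windows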
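Summary: LightGBM parameter list. Before using LightGBM, it is worth reading the parameter documentation carefully, since LightGBM can also implement many interesting algorithms such as random forest, dart, and goss, along with many helper features. The parameter documentation is linked here: https://github.com/Microsoft/LightGBM/blob/master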
Summary: Introduction The key to getting better at deep learning (or many fields) is practice. Practice on a variety of problems – from image processing to speec
Summary: Introduction Deep Learning at scale is disrupting many industries by creating chatbots and bots never seen before. On the other hand, a person just st
Summary: Previously we’ve covered the basics of exogenous variables in smooth functions. Today we will go slightly crazy and discuss automatic variable select
Summary: In this blog post, I am going to train a random forest on census data from the US to predict the probability that someone is looking for a job. To thi
Summary: In order to stay up to date, I try to follow Jeremy Howard on a regular basis. In one of his recent videos, he shows how to use embeddings for categor
Summary: BNOSAC is happy to announce the release of the udpipe R package (https://bnosac.github.io/udpipe/en) which is a Natural Language Processing toolkit th
Summary: “What does the world outside your head really ‘look’ like? Not only is there no color, there’s also no sound: the compression and expansion of air is
Summary: In a number of upcoming posts, I'll be analyzing an interesting dataset I found on Kaggle. The dataset contains information on 18,393 music reviews fr
Summary: Introduction Market Basket Analysis or association rules mining can be a very useful technique to gain insights into transactional data sets, and it can
Summary: At a glance: I explore half a million rows of disaggregated crash data for New Zealand, and along the way illustrate geo-spatial projections, maps, fo
Summary: One of the assumptions of the Classical Linear Regression Model is that there is no exact collinearity between the explanatory variables. If the explanato
Summary: In the last post, we focused on the preparation of a tidy dataset describing consumer perceptions of beverages. In this post, I'll describe some analy