Imaginary scenario: I am a data scientist at an online news company. The marketing manager wants to spend her budget promoting articles with high potential, so she asks me to build a model that predicts whether an article will get a good number of shares.
The articles in this dataset were published by Mashable (www.mashable.com), and the content, as well as the rights to reproduce it, belongs to them. Hence, this dataset does not share the original content, only some statistics associated with it. The original content can be publicly accessed and retrieved using the provided URLs…
Have you ever handed money to the clerk at a supermarket, only to find out that it was fake while a long line of people waited behind you to check out? Or, even more embarrassing, you weren’t carrying any other banknotes? I experienced this once, and the embarrassment of being taken for an immoral cheapskate stuck in my head for a long time. This motivated me to do this project: building a K-Means clustering model to detect whether a banknote is real or fake.

Imaginary scenario: My manager wants to understand the video game market. She has some domain knowledge of the gaming industry, but she doesn’t know how to work with data. Therefore, she asks me to create an easy-to-use interactive dashboard so that she can explore the insights herself.
To extract actionable insights from the dataset, I listed below all the questions that came to mind after assessing it, and I tried to investigate each of them:
1. For the two groups, those who unsubscribed and those who are still paying for the service, how long did they usually stay with the service, and what was their average LTV (lifetime value)?
2. For the two groups, those who unsubscribed and those who are still paying for the service, what proportion of people used or are using the phone service?
3. For the two groups, those who unsubscribed and those who…
Many people run social media accounts for their cute dogs. However, most of them don’t run those accounts very well, because they don’t know what audiences like or what drives favorite and retweet counts.
Therefore, I investigated the two questions below by analyzing the famous dog-rating Twitter account @weratedog, to help those who own or plan to start a social media account for their dog attract a larger audience by choosing the right type of dog (how bored am I):
1. Which stage/size of dogs got the highest retweet counts…
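Last week, Taiwan Startup Stadium invited us at Rookie Fund to take part in a three-day Term Sheet Bootcamp! So I am using this article to share what I learned and the interesting things I heard there.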

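Below, I pick out roughly three key points from each day so that readers unfamiliar with term sheets can take the knowledge away easily. (Because the three days covered so many key points, this article is a condensed selection and may feel a little scattered, but I believe it will still be quite helpful to readers!)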

I’m a Computer Science student @DVC and a Data Analyst Nanodegree graduate @Udacity. Connect with me on LinkedIn: https://www.linkedin.com/in/yueh-han-chen/