Misconceptions about big data: statistics and data

Misunderstandings about big data:

Statistics describes what has already happened, whereas big data is often used to predict what has not yet happened or to make recommendations; the two cannot be equated. Either way, whether we are talking about statistics or big data, the goal is the same: to make work more effective and decisions more rational and accurate.
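To make the distinction concrete, here is a minimal sketch that contrasts the two activities on the same numbers. The order counts and the function names `describe` and `predict_next` are hypothetical, made up purely for illustration, and a straight-line fit stands in for the far richer models a real big data project would use.

```python
from statistics import mean

# Hypothetical monthly order counts (illustrative data, not from any real project).
monthly_orders = [100, 110, 120, 130, 140, 150]


def describe(values):
    """Statistics: summarize what has already happened."""
    return {"total": sum(values), "mean": mean(values), "max": max(values)}


def predict_next(values):
    """Prediction: fit a least-squares line through (index, value) pairs
    and extrapolate one step past the observed data."""
    n = len(values)
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


summary = describe(monthly_orders)       # looks backward
forecast = predict_next(monthly_orders)  # looks forward

print("summary of past months:", summary)
print("forecast for next month:", forecast)
```

The summary only restates the past, while the forecast makes a claim about a month that has not happened yet and can therefore turn out to be wrong. That is the sense in which the two cannot be equated.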


Big data is red-hot. It has been applied across all walks of life, and lately it has shown clear signs of overheating. Is big data, in the end, a marketing term or a methodology? The author, Lee Jung, is a senior employee at a big data service provider whose projects involve analyzing data for different industries. In his view, you must first have a basic understanding of big data: a large amount of data does not necessarily have value. In addition, statistics is not the same thing as big data; what separates the two is artificial intelligence.

Over the past two years, big data has been applied across all walks of life, and lately it has shown clear signs of overheating. From CCTV's Spring Festival migration map, to the Chen Yao micro-blog data that drew exclamations during the NPC and CPPCC sessions, to the high-collar sweater worn by the professor in "Star", "big data" has been pushed to unprecedented heights. It has also gone from a specialized research direction to a well-known marketing term.

I am not qualified to speak for academia, nor am I qualified to judge who is right and who is wrong. I can only draw on my own work experience to describe big data as I see it:

What is big data?

Baidu Encyclopedia defines big data this way: big data, or massive data, refers to data sets so large that current mainstream software tools cannot capture, manage, process, and organize them within a reasonable time into information that helps enterprises make better business decisions.

Gartner gives this definition: "big data" refers to massive, fast-growing, and diverse information assets that require new processing models in order to deliver stronger decision-making power, insight discovery, and process optimization.

Personally, I think Gartner's definition is more appropriate. "New processing model" is the key phrase: it is one of the most important features of big data, and it is what distinguishes big data from traditional statistical analysis. The so-called "new processing model" has two meanings:

1. Because the volume of data is huge, more efficient storage and processing technology is needed. Hadoop has become a symbol of the big data era.

2. It is wrong, however, to think that big data equals Hadoop. Hadoop is a necessary condition for the big data era, but another clear hallmark of big data is its close integration with data mining and artificial intelligence. This is my understanding of big data, and many of today's so-called big data projects …

