The “beer and diapers” story is an old one, but it has lost none of its illustrative power.
Some time ago, Walmart studied its own sales data extensively and discovered many correlations. One stood out as especially unexpected: on Friday afternoons, young American men who bought diapers also tended to pick up packs of beer.
Without data mining, nobody would have asked the question in the first place. But why would beer sell together with diapers? Analysts have offered many hypotheses. The most common one is that young fathers who have to stay home to look after their babies buy beer as a treat because they cannot go out to bars. Others argue that young men with children are better planners and so buy beer for the weeks ahead.
Either way, causality was hard to establish. What mattered was the result: when stores started placing diapers close to the beer, most young American fathers easily found their little pack of indulgence near their bags of duty, and sales soared. The profit came purely from the power of data mining (well, and from the effort of moving the diaper section next to the beer section).
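For readers curious how such a correlation might be measured, here is a minimal Python sketch of three standard association-rule metrics: support, confidence and lift. The baskets are made up for illustration and are not Walmart data, and the function names are simply ones chosen for this example.

```python
# Toy, made-up shopping baskets for illustration only (not real sales data).
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
    {"bread", "chips"},
    {"milk", "bread", "chips"},
]


def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Among baskets containing `antecedent`, the fraction that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """How much more often the two occur together than if they were independent.

    A lift above 1 suggests a positive association; it says nothing about causality.
    """
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


if __name__ == "__main__":
    print("support(diapers, beer):   ", support({"diapers", "beer"}, baskets))
    print("confidence(diapers->beer):", confidence({"diapers"}, {"beer"}, baskets))
    print("lift(diapers->beer):      ", lift({"diapers"}, {"beer"}, baskets))
```

In this toy data, beer turns up in diaper baskets more often than it does in baskets overall, so the lift comes out above 1. That is the kind of signal an analyst would then investigate further.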
In recent years, big corporations and public institutions around the world have invested billions of dollars in building data warehouses. Competent data scientists are in high demand to solve problems ranging from pattern recognition for improving homeland security to geospatial data analysis for safe, iceberg-free navigation, and from speech recognition that helps people with disabilities be understood more easily to deep neural networks for building artificial-intelligence health care systems.
Data science is not for everyone, but it does not hurt to learn more about the field. Contrary to popular belief, data science does not demand as much mathematics as fields like pure math or physics. It is about being creative with a dataset in order to draw valuable insights from it or make predictions.
According to indeed.com, as of October 30, 2017, the average annual salary for a data scientist was $95,620 (CAD) in Canada and $130,567 (USD) in the US. Naturally, more experienced data scientists enjoy bigger paychecks.
Although there might still be a long way to go, it is never too early to explore which professions suit your interests and personality. There are currently many paths to becoming a data scientist. One is to study statistics or computer science at a post-secondary institution. Another is to take professional training through government-funded programs, which are usually offered by colleges. Many courses are also available online. Some of the best sites include Coursera, Codecademy, Dataquest, DataCamp and Kaggle (for competitions).
In brief, no path is significantly better than the others. What always matters is passion and hard work in whatever you do.
Image Source: Wikimedia
Originally published in Bandersnatch Vol. 47 Issue 05 on November 8, 2017