Skeleton of data science for beginners | A Quick Intro Useful Info

Overview of Data Science for beginners

This post gives an overview of data science. It is useful for beginners who are interested in the subject at a basic level. It is just a skeleton of data science for beginners.

What is Data Science? For Beginners

Wikipedia defines data science as a field that uses scientific methods and algorithms to extract knowledge from structured or unstructured data. For instance, we buy groceries from supermarkets every day and get a bill for every purchase. Do we gather all those bills and save them?

A big "No".

What if we collected and consolidated all the bills for one year and analyzed them one day? That is data science.

Process of Extracting Knowledge

When you analyze all your bills, you find useful knowledge that can reduce your grocery expenses in the future. It is a kind of learning from the past and applying it to the present and the future. Isn't it?


Let's say your purchase history shows that, over the year, you mostly bought vegetable A, and that it costs more only in a particular season. Using scientific methods, you can discover this and then avoid buying vegetable A in that season, or buy less of it. The same thing is happening in our big data world. Servers around the world hold enormous volumes of data, and the volume is still increasing. We have enough data to analyze and predict a great many things.
Now it's time to become a data scientist and move on.
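
The vegetable A example can be sketched in a few lines of Python. This is only a minimal illustration, not a real analysis: the bill records, the prices, the month-to-season split, and the function names (season_of, average_price_by_season, priciest_season) are all made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical bill records: (month, item, price per kg).
# All values are invented purely for illustration.
bills = [
    (1, "vegetable A", 40.0),
    (2, "vegetable A", 42.0),
    (6, "vegetable A", 80.0),
    (7, "vegetable A", 90.0),
    (10, "vegetable A", 45.0),
    (1, "vegetable B", 31.0),
    (6, "vegetable B", 30.0),
]


def season_of(month):
    """Map a month number to a season label (an assumed, simplified split)."""
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "autumn"


def average_price_by_season(records, item):
    """Group one item's prices by season and average each group."""
    prices = defaultdict(list)
    for month, name, price in records:
        if name == item:
            prices[season_of(month)].append(price)
    return {season: mean(values) for season, values in prices.items()}


def priciest_season(records, item):
    """Return the season in which the item had the highest average price."""
    averages = average_price_by_season(records, item)
    return max(averages, key=averages.get)


averages = average_price_by_season(bills, "vegetable A")
print(averages)
print("Buy less vegetable A in:", priciest_season(bills, "vegetable A"))
```

With a full year of real bills in place of the sample list, the same grouping and averaging would show in which season each item is most expensive.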

Why Does the Market Need Data Science?

Market prediction is always important for market growth. The banking sector is showing more interest in analyzing customers' transactions to find out what customers are interested in and to predict future behavior. That is why fintech companies are also trying to capture customers' transactions. Apps such as GPay, Paytm, and Amazon are examples. You can see many offers, cashback deals, and scratch cards in these applications; they are there to attract customers, because the companies need data for market prediction.

Data science has also entered many other areas, including medicine and health care.

Top Companies Hiring Data Scientists

Most companies need them. Here is a list of top companies that offer jobs in data science.

  • Search engine companies: Google, Yahoo, Bing, Ask.
  • Social network companies: Facebook, Twitter, LinkedIn, Instagram, Tumblr.
  • E-commerce, technology, and financial companies such as Amazon, Apple, PayPal, eBay, EMC, Bank of America, GE Capital, and Capital One.

Besides the companies above, many service-based companies are also hiring data scientists.

You can check the complete list of companies hiring data scientists.

Data Science Salary

When it comes to salary, the amount always depends on several factors.

  • The main factor is the company offering the job. Salary levels depend heavily on company size and industry type. Top companies pay more than mid-sized or service-based companies.
  • Experience level decides the salary and the size of salary hikes.
  • Job location is another major factor and can cause some variation in salary.
  • Educational background plays some role when the salary is decided in interviews.
According to PayScale, the salary for a data scientist at various levels in India is as follows.
  • At entry level, a data scientist's average annual CTC across India is Rs 540,449.
  • A mid-level data scientist can get an average annual CTC of about Rs 1,012,298 across India.
  • An experienced data scientist gets about Rs 1,731,805 across India.
Check more data science salary details.

Data Science Books and Tutorials

A lot of beginner tutorials and guides for data science are available, both in digital formats and as books. Nowadays many websites offer highly interactive online tutorials. Everyone has their own learning curve and style. Reading books is still very important, because it gives you the same grounding as everyone else who has read them.

Must-read books for Data Scientists
  • R Programming For Dummies - by Andrie de Vries
  • Python Data Science Handbook: Essential Tools for Working with Data - by Jake VanderPlas 
  • Business Analytics: The Science of Data-Driven Decision Making - by U Dinesh Kumar
  • What Is Data Science? - by Mike Loukides
  • Introducing Data Science: Big Data, Machine Learning, and More Using Python Tools - by Davy Cielen 
  • Numsense! Data Science for the Layman: No Math Added - by Annalyn Ng
  • Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data - by EMC
I have read a post about data science books on the Analytics Vidhya website.
It gave me a good view of the concepts to learn before diving into the topic, and it shows which books every data scientist should read. Here is the referral link.

Other than books, highly interactive tutorials are available for data science. Several websites offer free tutorials for beginners.

Data Science Must-Have Skills

If you are going to be a data scientist, you must have the skills below. Prepare yourself and practice them; you should not skip any of these skills.
  • Basic Computer Science
  • Python or R Programming
  • Hadoop
  • SQL Database and Coding
  • Apache Spark
  • AI and Machine Learning
  • Logical and Analytical Thinking
  • Business-Oriented Thinking
  • Teamwork and Communication Skills

43 Tool Names Data Scientists Should Be Aware Of

Every data scientist should be aware of these tool names, so that they can explore the options and use each tool in the right situation. Every tool has its own features and behavior.
  1. Apache Spark
  2. BigML
  3. Bokeh
  4. Cascading
  5. Clojure
  6. D3.js
  7. DataRobot
  8. Excel
  9. Feature Labs
  10. ForecastThis
  11. Fusion Tables
  12. Gawk
  13. ggplot2
  14. GraphLab Create
  15. IPython
  16. Java
  17. Jupyter
  18. KNIME Analytics Platform
  19. Logical Glue
  20. MATLAB
  21. Matplotlib
  22. MLBase
  23. MySQL
  24. Narrative Science
  25. NetworkX
  26. NLTK
  27. NumPy
  28. Octave
  29. OpenRefine
  30. pandas
  31. PyXLL
  32. RapidMiner
  33. Redis
  34. RStudio
  35. SAS
  36. Scala
  37. Scikit-learn
  38. SciPy
  39. Shiny
  40. Tableau
  41. TensorFlow
  42. TIBCO Spotfire
  43. Weka
